Senior AI Software Developer in Test

2 months, 1 week ago
Full-time
Senior
Artificial Intelligence and Machine Learning
Caseware

CaseWare International Inc. provides cutting-edge software solutions for accounting firms, corporations, and governments, enabling users worldwide to work smarter and transform insights into impact.

Internet Software & Services
251-1K
Founded 1988

Description

  • Evolve an AI-first quality strategy for a fast-scaling cloud-native SaaS platform and emerging agentic systems.
  • Integrate AI-enhanced testing into CI/CD pipelines, including predictive flakiness detection, automated test generation, and self-healing scripts.
  • Design deterministic and statistical testing approaches for LLM-based and agentic systems to address hallucinations, prompt injection, bias, drift, and safety risks.
  • Build automated evaluation pipelines and harnesses for correctness, faithfulness, retrieval quality, generation accuracy, tool-calling, planning sequences, and multi-agent flows.
  • Develop and execute test frameworks across the full AI lifecycle, including prompts, datasets, embeddings, model versions, RAG pipelines, and guardrails.
  • Implement red-teaming, bias and fairness checks, compliance mechanisms, and AI quality signals for automated gating and continuous monitoring.
  • Partner with product, data science, AI engineering, and development teams to test AI features and support roadmap delivery.
  • Drive quality metrics and observability, including DORA metrics, test coverage, hallucination rate, context precision, and drift detection.
  • Build dashboards, support A/B testing of models, and monitor post-deployment AI behavior.
  • Mentor SDETs, lead workshops on AI testing best practices, and help define roadmaps and standards for sustainable AI quality assurance.

Requirements

  • 7+ years of experience in Quality Engineering or SDET roles within cloud-native SaaS environments.
  • 2+ years of hands-on experience with AI, ML, or LLM systems.
  • Strong experience with automated testing infrastructure, CI/CD tools such as Jenkins or GitHub Actions, and test pyramid strategies from unit to end-to-end.
  • Full-stack testing experience across frontend, backend, and API layers.
  • Proven experience testing LLMs, AI agents, and RAG pipelines, including risks such as hallucinations, prompt injection, bias, and drift.
  • Proficiency in JavaScript or TypeScript and working knowledge of Python or Java.
  • Experience with AI evaluation frameworks such as Ragas, DeepEval, LangChain, LangSmith, or Langfuse.
  • Knowledge of observability tools such as New Relic, statistical testing methods, red-teaming, and ethical AI practices.
  • Experience with performance, stress, and load testing tools such as k6, JMeter, or BlazeMeter is nice to have.
  • Bachelor's or Master's degree in Computer Science, AI, or a related field; ISTQB AI Testing certification is a plus.
  • Strong English communication and collaboration skills.
  • A strong portfolio, open-source contributions, or relevant case studies are highly regarded.

Benefits

  • Indefinite-term employment contract with all legal benefits.
  • Prepaid medicine, life insurance, and funeral assistance.
  • Internet allowance and home office stipend.
  • Competitive compensation above the market average.
  • 100% remote work environment with excellent work-life balance.
  • Budget for training and mentorship from a highly experienced professional.
  • 5 personal days off per year, plus a sick leave top-up to 100% of salary from day 3 through day 90.
  • Recognition awards, additional paid time off, and vacation upgrades starting at 5 years of service.


