Caseware

CaseWare International Inc. provides cutting-edge software solutions for accounting firms, corporations, and governments, enabling users worldwide to work smarter and transform insights into impact.

Internet Software & Services

Information Technology

251-1K (420)

Founded 1988

22 open positions

Senior AI Software Developer in Test

2 hours, 4 minutes ago

Colombia

Full-time

Senior

AI (Artificial Intelligence)

Artificial Intelligence and Machine Learning

CI/CD, GitHub Actions, Java, JavaScript, Jenkins, JMeter, k6, LLM, Machine Learning, New Relic, Python, TypeScript

Description

Evolve an AI-first quality strategy for a fast-scaling cloud-native SaaS platform and emerging agentic systems.
Integrate AI-enhanced testing into CI/CD pipelines, including predictive flakiness detection, automated test generation, and self-healing scripts.
Design deterministic and statistical testing approaches for LLM-based and agentic systems to address hallucinations, prompt injection, bias, drift, and safety risks.
Build automated evaluation pipelines and harnesses for correctness, faithfulness, retrieval quality, generation accuracy, tool-calling, planning sequences, and multi-agent flows.
Develop and execute test frameworks across the full AI lifecycle, including prompts, datasets, embeddings, model versions, RAG pipelines, and guardrails.
Implement red-teaming, bias and fairness checks, compliance mechanisms, and AI quality signals for automated gating and continuous monitoring.
Partner with product, data science, AI engineering, and development teams to test AI features and support roadmap delivery.
Drive quality metrics and observability, including DORA metrics, test coverage, hallucination rate, context precision, and drift detection.
Build dashboards, support A/B testing of models, and monitor post-deployment AI behavior.
Mentor SDETs, lead workshops on AI testing best practices, and help define roadmaps and standards for sustainable AI quality assurance.

Requirements

7+ years of experience in Quality Engineering or SDET roles within cloud-native SaaS environments.
2+ years of hands-on experience with AI, ML, or LLM systems.
Strong experience with automated testing infrastructure, CI/CD tools such as Jenkins or GitHub Actions, and test pyramid strategies from unit to end-to-end.
Full-stack testing experience across frontend, backend, and API layers.
Proven experience testing LLMs, AI agents, and RAG pipelines, including risks such as hallucinations, prompt injection, bias, and drift.
Proficiency in JavaScript or TypeScript and working knowledge of Python or Java.
Experience with AI evaluation frameworks such as Ragas, DeepEval, LangChain, LangSmith, or LangFuse.
Knowledge of observability tools such as New Relic, statistical testing methods, red-teaming, and ethical AI practices.
Experience with performance, stress, and load testing tools such as k6, JMeter, or BlazeMeter is nice to have.
Bachelor's or Master's degree in Computer Science, AI, or a related field; ISTQB AI Testing certification is a plus.
Strong English communication and collaboration skills.
A strong portfolio, open-source contributions, or relevant case studies are highly regarded.

Benefits

Indefinite-term employment contract (contrato a término indefinido) with all legally required benefits.
Prepaid private health plan (medicina prepagada), life insurance, and funeral assistance.
Internet allowance and home office stipend.
Competitive compensation above the market average.
100% remote work environment with excellent work-life balance.
Budget for training and mentorship from a highly experienced professional.
5 personal days off per year, plus a sick-leave top-up to 100% of salary from day 3 through day 90.
Recognition awards, additional paid time off, and vacation upgrades starting at 5 years of service.
