[Job-29222] Senior AI Tester/QA and Curation, Brazil

1 hour, 4 minutes ago
Full-time
Senior
Artificial Intelligence and Machine Learning
CI&T

CI&T is a global digital technology agency empowering agile growth for leading companies through advanced technologies with a team of 2000 experts worldwide.

Internet Software & Services
5K-10K
Founded 1995

Description

  • Define and execute evaluation frameworks for generative AI models across relevance, fidelity, completeness, coherence, and hallucination detection.
  • Build and maintain automated and manual test suites for Copilots and AI assistants across multiple scenarios and edge cases.
  • Run comparative A/B evaluations of prompts, models, and RAG configurations to guide improvement decisions.
  • Validate generated responses against source-of-truth materials such as documents, structured data, and internal policies before and after updates.
  • Identify failure patterns, inadequate answers, and unexpected behaviors, and document issues for the development team.
  • Curate knowledge bases that support RAG solutions, ensuring indexed documents are relevant, current, and well structured.
  • Define and apply criteria for document inclusion, exclusion, updates, and versioning within knowledge bases.
  • Monitor chunk quality and vector search indexes in Azure AI Search and recommend indexing strategy adjustments when needed.
  • Operate the feedback loop for AI solutions by collecting user evaluations, analyzing conversations, and turning insights into improvements.
  • Serve as the bridge between end users and the technical team, supporting validation, user acceptance testing (homologation), and use-case discovery.

Requirements

  • Experience in testing, quality assurance, or evaluation of software or data systems.
  • Strong understanding of Generative AI, LLMs, and RAG concepts.
  • Ability to analyze logs, metrics, and behavioral patterns in AI systems.
  • Proficiency in Python for evaluation automation and quality data analysis.
  • Clear communication skills to report issues and recommend improvements to technical and non-technical audiences.
  • Organization and discipline to document processes, incidents, and decisions consistently.
  • Knowledge of responsible AI principles, AI ethics, and LGPD/GDPR.
  • Experience with LLM evaluation tools such as Azure AI Evaluation, Promptflow Evaluators, RAGAS, or TruLens is preferred.
  • Knowledge of Azure AI Foundry and its monitoring and evaluation features is preferred.
  • Familiarity with Azure AI Search and indexing strategies for RAG is preferred.
  • Experience with Microsoft Copilot Studio from a testing and validation perspective is preferred.
  • Basic knowledge of Prompt Engineering for diagnosing and tuning model behavior is preferred.
  • Experience with Power BI or similar tools for monitoring dashboards is preferred.
  • Knowledge of Microsoft’s RAI framework and Azure OpenAI responsible-use guidelines is preferred.
  • Previous experience in regulated sectors such as finance, healthcare, or legal is preferred.

Benefits

  • Health and dental insurance.
  • Meal and food allowance.
  • Childcare assistance.
  • Extended parental leave.
  • Gym and wellness partnerships through Wellhub (Gympass) and TotalPass.
  • Profit sharing (PLR).
  • Life insurance.
  • Continuous learning platform (CI&T University) and access to online course partnerships.
  • Discount club and a free online platform focused on physical, mental, and overall well-being support.
  • Language learning platform.
