AI Engineer, Evaluation - 11319

4 hours, 36 minutes ago
Mid Level
Software Development
Coupa Software

Coupa Software is the premier cloud-based finance platform, empowering companies worldwide to optimize spend, boost profits, and reduce costs with a comprehensive suite of modules.

Internet Software & Services
1K-5K employees
Founded 2006

Description

  • Build and maintain automated evaluation pipelines for AI model quality.
  • Implement task-specific benchmarks and test suites.
  • Design dashboards that track accuracy, regression, and safety metrics.
  • Implement automated regression testing for every model iteration.
  • Build comparison frameworks for side-by-side evaluation of model variants.
  • Analyze evaluation results to identify failure modes and report findings to the ML team.
  • Maintain evaluation datasets, including versioning, quality validation, and coverage analysis.
  • Support A/B testing infrastructure for production model validation.

Requirements

  • 3+ years of software engineering experience.
  • Proficiency in Python.
  • Experience with statistical analysis and data visualization.
  • Understanding of ML model evaluation concepts such as precision, recall, F1, and human evaluation.
  • Experience building automated test frameworks and CI/CD pipelines.
  • Familiarity with dashboarding tools.
  • Strong analytical and problem-solving skills.
  • BS in Computer Science, Statistics, or equivalent experience.

Benefits

  • Remote work option.
  • Inclusive and welcoming work environment.
  • Opportunity to work on AI technology with global impact.
