AI Evaluation Engineer - Business & Operations Domain

Contract
Mid Level
Artificial Intelligence and Machine Learning

Description

  • Design realistic business and operational workflow scenarios for AI evaluation systems.
  • Create structured tasks involving analytics, reporting, operational reasoning, and process optimization.
  • Develop clear task specifications, expected outcomes, and validation logic.
  • Identify operational edge cases, bottlenecks, and workflow failure scenarios.
  • Evaluate AI-generated outputs for reasoning quality, usefulness, and accuracy.
  • Contribute expertise across business operations, analytics, automation, or operational systems.
  • Review and improve workflow complexity, clarity, and evaluation quality.
  • Collaborate with reviewers and researchers to refine AI benchmark scenarios.
  • Help create realistic multi-step business and operational problem-solving tasks.

Requirements

  • 3–10 years of experience in operations, analytics, consulting, business systems, or related domains.
  • Strong analytical thinking and operational problem-solving skills.
  • Experience working with operational workflows, reporting systems, CRM tools, or business analytics.
  • Good understanding of cross-functional business processes and dependencies.
  • Experience with spreadsheets, dashboards, operational reporting, or workflow automation.
  • Strong written communication and documentation skills.
  • Exposure to AI systems, automation platforms, or evaluation workflows is preferred.
  • Ability to design realistic and structured operational scenarios for evaluation purposes.
  • Must be available for a contractor assignment lasting 5 weeks.
  • Must be able to work full-time (40 hours/week) or part-time (20 hours/week), with at least 4 hours of overlap with Pacific Time (PST) working hours.
