Freelance Agent Evaluation Engineer

2 hours, 38 minutes ago
Part-time
Senior
Software Development
Mindrift.ai: Be the “I” in AI

Join 10,000+ experts earning $15-50/hr training AI models remotely. Flexible freelance work, weekly payments. No AI experience required. Apply in 5 minutes.

Internet Software & Services

Description

  • Build realistic developer environments with codebases, infrastructure, and supporting context such as tickets, docs, and conversations.
  • Design tasks from intermediate environment states and define what counts as a correct solution.
  • Write tests that validate agent solutions, accepting valid approaches and rejecting incorrect ones.
  • Review agent outputs, analyze failures, and refine tasks and tests based on QA feedback.
  • Ensure tasks are solvable by an AI agent and aligned with fair, robust evaluation criteria.
  • Develop evaluation scenarios that reveal meaningful differences between strong and weak coding solutions.

Requirements

  • 5+ years of software development experience.
  • Experience with Python, including FastAPI.
  • Experience with JavaScript or TypeScript, including React.
  • Experience with Docker, Postgres, Kafka, and Redis.
  • Experience writing functional and integration tests.
  • English proficiency at B2 level or higher.
  • CV must be submitted in English.
  • Availability to complete tasks estimated at around 20 hours each and to deliver the work by the deadline.
  • Experience evaluating coding solutions or understanding common AI model failure modes is preferred.

Benefits

  • Up to $50/hr equivalent compensation, depending on level and pace.
  • Project-based work with no permanent employment commitment.
  • Flexible schedule with self-directed working hours.
  • Tasks estimated at about 20 hours each, with freedom to choose when and how you work.
  • Payment for completed project tasks upon acceptance.

