Freelance Agent Evaluation Engineer

7 hours, 54 minutes ago
Part-time
Senior
Software Development
Mindrift.ai: Be the “I” in AI

Join 10,000+ experts earning $15-50/hr training AI models remotely. Flexible freelance work, weekly payments. No AI experience required. Apply in 5 minutes.

Internet Software & Services

Description

  • Build realistic developer environments with codebases, infrastructure, and supporting context such as tickets, documentation, and conversations.
  • Design evaluation tasks that start from intermediate states of these environments, each with a prompt and clear criteria for what counts as solved.
  • Write tests that validate AI agent solutions, accepting valid approaches and rejecting incorrect ones.
  • Review agent solutions, analyze failures, and refine tasks and tests based on QA feedback.
  • Iterate on scenarios until the evaluation is fair, robust, and able to distinguish strong solutions from weak ones.
  • Help create a dataset used to evaluate how well AI coding agents handle real-world developer tasks.

Requirements

  • 5+ years of experience in software development.
  • Experience with Python, preferably FastAPI.
  • Experience with JavaScript or TypeScript, preferably React.
  • Experience with Docker, Postgres, Kafka, and Redis.
  • Experience writing functional and integration tests.
  • English proficiency at B2 level or higher.
  • Submit a CV in English and indicate your English proficiency level.
  • Ability to complete each project task within its estimated effort of about 20 hours and to meet submission deadlines.

Benefits

  • Up to $50/hr equivalent compensation, depending on level and pace.
  • Project-based work with flexible scheduling and no fixed hours.
  • About 20 hours of estimated effort per task, with freedom to choose when and how you work.
  • Work on AI evaluation projects for leading tech companies.
