Senior Software Engineer - Agent Evaluation

5 hours, 6 minutes ago
Part-time
Senior
Software Development
Mindrift.ai: Be the “I” in AI

Join 10,000+ experts earning $15-50/hr training AI models remotely. Flexible freelance work, weekly payments. No AI experience required. Apply in 5 minutes.

Internet Software & Services

Description

  • Build realistic virtual developer environments with codebases, infrastructure, and supporting context such as tickets, documentation, and conversations.
  • Design evaluation tasks from intermediate states of those environments and define what counts as a correct solution.
  • Write functional and integration tests that verify agent solutions and distinguish correct from incorrect approaches.
  • Review agent solutions, analyze failure cases, and iterate on tasks and tests based on QA feedback.
  • Ensure tasks are solvable by an AI agent and challenging enough to evaluate strong coding models.
  • Refine evaluation criteria to make the assessment fair, robust, and neither too strict nor too lenient.

Requirements

  • 5+ years of experience in software development.
  • Experience with Python (FastAPI).
  • Experience with JavaScript/TypeScript (React).
  • Experience with Docker, Postgres, Kafka, and Redis.
  • Experience writing functional and integration tests.
  • English proficiency at B2+ level.
  • Submit CV in English and indicate your English proficiency.
  • Ability to work on a project-based, non-permanent engagement.
  • Preferred: understanding of how frontier AI coding models fail in realistic development scenarios.

Benefits

  • Up to $60/hr equivalent compensation, depending on level and pace.
  • Project-based work with flexible scheduling; you choose when and how to work.
  • Tasks are estimated at about 20 hours each.
  • Opportunity to work on AI evaluation projects for leading tech companies.
  • Payment is tied to completed tasks after qualification and project assignment.
