Software Engineer, Benchmarking

11 hours, 21 minutes ago
Full-time
Junior
Software Development
Epoch AI

Epoch AI is a research institute that investigates the trends and questions shaping the development and governance of artificial intelligence, and their implications for society.

Professional Services
1-10
Founded 2022

Description

  • Implement AI benchmarks within the evaluation infrastructure, primarily using the Inspect library.
  • Develop and maintain the existing benchmark suite to support fast evaluation of new model releases.
  • Integrate benchmarks with AI providers and run them on Epoch AI’s infrastructure.
  • Design and develop new benchmarks for evaluating AI capabilities.
  • Pitch and prototype new benchmark ideas and related experiments.
  • Facilitate internal experiments and evaluation workflows.
  • Collaborate with researchers, analysts, and engineers to ensure accurate, insightful evaluation outputs are integrated into research products and publications.

Requirements

  • More than 2 years of professional software engineering experience building and maintaining complex systems.
  • Strong engineering skills with the ability to write high-quality, robust, and maintainable code.
  • Comfort working deeply within existing codebases and infrastructure.
  • Ability to generate ideas for new benchmarks, experiments, and other projects.
  • Motivation to support Epoch AI’s mission of delivering rigorous, independent AI evaluations.
  • AI domain expertise or cybersecurity experience is a strong plus but not required.
  • Hands-on experience running LLM evaluations is preferred.
  • Familiarity with evaluation frameworks like Inspect is preferred.
  • Solid grasp of current AI trends is preferred.
  • Professional-level English proficiency is required, and all application materials must be submitted in English.

Benefits

  • Annual salary of $125,000 to $200,000 USD.
  • Compensation can be paid in local currencies rather than only USD.
  • Fully remote work with flexible hours and schedules for most roles.
  • Comprehensive global health insurance, including supplemental local benefits where available and mandated.
  • Life insurance and a pension plan, if applicable in your country.
  • Generous paid time off, including 30 protected days per year, unlimited personal and sick leave, and up to 6 months of parental leave for permanent staff.
  • Flexible expense policy for equipment, productivity tools, and learning or development opportunities.
  • Paid work trips, including 3 staff retreats per year and relevant conferences.
  • Access to Berkeley offices with paid meals, snacks, a gym, and at least 20 office-access days per year for all staff.
