AssemblyAI

AssemblyAI is a leading provider of AI models for transcribing and understanding speech. Their Speech AI models offer accurate speech-to-text conversion, speaker detection, sentiment analysis, and more, enabling users to extract valuable insights from ...

Media

Consumer Discretionary

51-250 (90)

Founded 2017

$63M raised

5 open positions

Research Engineer, Evaluations

3 weeks, 4 days ago

United States

Full-time

Senior

Research Scientist

Software Development

LLM Machine Learning Python SQL

Description

Own end-to-end and integration-level model evaluation across accuracy, latency, and feature-specific metrics.
Build and maintain competitive benchmarking pipelines against other providers in the market.
Design and run systematic experiments to measure the impact of model changes.
Onboard, curate, and maintain evaluation datasets, including public benchmarks and internal test sets.
Create evaluation subsets that stress-test specific capabilities and edge cases.
Define evaluation metrics that capture real-world performance.
Translate qualitative customer feedback into quantifiable evaluation criteria.
Work with customer-facing teams to understand pain points and turn them into research priorities.
Maintain clean evaluation pipelines and clear documentation to reduce friction for researchers.
Proactively identify evaluation gaps and propose solutions.

Requirements

Understanding of machine learning fundamentals, including how models are trained and evaluated.
Strong Python skills for writing evaluation scripts and working with data pipelines.
Comfort working with SQL and cloud infrastructure.
Strong intuition for evaluation metrics, including relative vs. absolute improvements and statistical rigor.
Familiarity with the voice agent stack, including VAD, ASR, turn detection, LLM, and TTS.
Ability to communicate technical results to researchers, leadership, and customer-facing teams.
Ownership mindset and ability to independently identify and fill evaluation gaps.
Tinkerer mentality and willingness to ship rough versions and iterate quickly.
Availability for at least 3-4 working hours that overlap with the US Eastern time zone.
Experience with speech/audio ML or real-time systems (preferred).
Hands-on experience with voice agent orchestrators such as LiveKit, Pipecat, or Vapi (preferred).
Familiarity with standard ML evaluation practices and benchmarks (preferred).
Experience working with customer-facing or product teams (preferred).
Background in QA, data science, or applied ML roles (preferred).

Benefits

Salary range of $210,000 - $260,000.
Competitive compensation structure with opportunities for additional rewards and benefits.
Opportunity to work at a small, high-growth company with outsized ownership and impact.
A lean environment with fewer layers of bureaucracy and faster decision-making.
Exposure to meaningful scale in a proven business serving major customers.
Commitment to pay equity and consideration of relevant experience and qualifications in compensation decisions.

Similar Roles

Principal Engineer in Bayesian, Large Foundational Systems, and Distributional Reinforcement Learning

Airbnb 5K-10K Hotels, Restaurants & Leisure

Airbnb is hiring a Principal AI/ML Researcher and Engineer to advance probabilistic, adaptive AI systems that improve personalization, ranking, and decision-making across guest and host experiences at scale.

United States Full-time Lead Research Scientist Technical Lead

$296k-$370k

Apache Spark C++ Java Kafka LLM Machine Learning Python PyTorch Scala Statistics TensorFlow

22 hours, 23 minutes ago

Senior Simulation and Modeling Engineer

Relativity Space 251-1K Aerospace & Defense

Relativity Space is hiring a Guidance, Navigation, and Control and Performance engineer to develop simulation tools and models that support Terran R flight algorithm development, analysis, and testing.

United States Full-time Senior Research Scientist Software Engineer

$148k-$204k

C++ CI/CD Docker Python Rust

22 hours, 53 minutes ago

Senior Scraping Engineer (Web scraping & Anti-bot)

Infatica 1-10 Internet Software & Services

Infatica.io is seeking an experienced Tech Engineer to help build and lead the architecture of a high-load web scraping platform that delivers clean HTML or structured JSON outputs for cloud and on-premises deployments.

Cyprus Full-time Senior DevOps Engineer Research Scientist

CI/CD Cloudflare Docker Go Grafana Helm HTTP Kubernetes Microservices Playwright Prometheus Puppeteer Python Redis Selenium TLS

1 day, 23 hours ago

Health Science Research Intern

OURA 251-1K Health Care Providers & Services

Oura is hiring a remote U.S. Health Science Research Intern to support clinical and real-world evidence research by contributing to study design, documentation, and data-driven insights for its Health Science team.

United States Internship Entry Level Data Scientist Research Scientist

$94k-$125k

Python R SQL

2 days, 13 hours ago
