Staff Machine Learning Infrastructure Engineer, Simulation

1 week, 2 days ago
Full-time
Senior
DevOps and Infrastructure

Waymo

Waymo is an autonomous driving technology company building the Waymo Driver and operating Waymo One, its fully autonomous ride-hailing service.

Autonomous vehicles, robotics, AI, ride-hailing / mobility tech
Founded 2009
$21.6B raised

Description

  • Advance ultra-realistic multi-agent simulations using foundation models as part of a high-performing research engineering team.
  • Collaborate with Google DeepMind, Waymo Realism Modeling in London, and Waymo Oxford to improve simulation realism.
  • Provide technical leadership on large-scale ML model architectures for autonomous vehicle models.
  • Work across data engineering, model development, and deployment while guiding architectural decisions and technical direction.
  • Own large, complex systems and drive architectures that meet technical and business objectives.
  • Design and scale distributed systems across the ML lifecycle for planet-scale dataset generation and model training.
  • Partner cross-functionally to define performance and system-level requirements for large ML systems.
  • Translate product and business goals into measurable technical deliverables.
  • Mentor junior engineers and help foster a collaborative engineering culture.

Requirements

  • BS in Computer Science, Robotics, a similar technical field, or equivalent practical experience.
  • 5+ years of professional software engineering experience.
  • At least 3 years of experience in machine learning infrastructure, including developing, scaling, training, deploying, and optimizing large-scale ML systems from data to model.
  • MS in Computer Science, Robotics, a similar technical field, or equivalent practical experience (preferred).
  • 10+ years of professional software engineering experience (preferred).
  • At least 5 years of experience in machine learning infrastructure, including developing, designing, scaling, training, deploying, and optimizing large-scale ML systems from data to model (preferred).
  • Experience with ML infrastructure tools such as DeepSpeed, PyTorch, TensorFlow, or similar frameworks.
  • Strong expertise in distributed training techniques, including gradient sharding and optimization strategies for scaling large models.
  • Experience using ML accelerator profiling tools to uncover performance bottlenecks.
  • Deep understanding of modern ML models such as auto-regressive transformers and familiarity with custom kernels for hardware efficiency.
  • Practical familiarity with autonomous driving, simulations, and ML accelerators is a plus.

Benefits

  • Base salary range of £155,000 to £163,000.
  • Eligibility for Waymo’s discretionary annual bonus program.
  • Eligibility for Waymo’s equity incentive plan.
  • Access to Waymo’s generous company benefits program, subject to eligibility requirements.


