Staff Machine Learning Infrastructure Engineer, Simulation

1 week ago
Full-time
Senior
DevOps and Infrastructure

Waymo

Waymo is an autonomous driving technology company building the Waymo Driver and operating Waymo One, its fully autonomous ride-hailing service.

Autonomous vehicles, robotics, AI, ride-hailing / mobility tech
Founded 2009
$21.6B raised

Description

  • Advance ultra-realistic multi-agent simulations using foundation models as part of a high-performing research engineering team.
  • Collaborate with Google DeepMind, Waymo Realism Modeling in London, and Waymo Oxford to improve simulation realism.
  • Provide technical leadership on large-scale ML architectures for autonomous vehicle models.
  • Work across data engineering, model development, and deployment while guiding architectural decisions and technical direction.
  • Own large, complex systems and drive architectures that meet technical and business objectives.
  • Design and scale distributed systems across the ML lifecycle for planet-scale dataset generation and model training.
  • Partner cross-functionally to define performance and system-level requirements for large ML systems.
  • Translate product and business goals into measurable technical deliverables.
  • Mentor junior engineers and help foster a collaborative engineering culture.

Requirements

  • BS in Computer Science, Robotics, a similar technical field, or equivalent practical experience.
  • 5+ years of professional software engineering experience.
  • At least 3 years of experience in machine learning infrastructure, including developing, scaling, training, deploying, and optimizing large-scale ML systems from data to model.
  • MS in Computer Science, Robotics, a similar technical field, or equivalent practical experience (preferred).
  • 10+ years of professional software engineering experience (preferred).
  • At least 5 years of experience in machine learning infrastructure, including developing, designing, scaling, training, deploying, and optimizing large-scale ML systems from data to model (preferred).
  • Experience with ML infrastructure tools such as DeepSpeed, PyTorch, TensorFlow, or similar frameworks.
  • Strong expertise in distributed training techniques, including gradient sharding and optimization strategies for scaling large models.
  • Experience using ML accelerator profiling tools to uncover performance bottlenecks.
  • Deep understanding of modern ML models such as auto-regressive transformers and familiarity with custom kernels for hardware efficiency.
  • Practical familiarity with autonomous driving, simulations, and ML accelerators is a plus.

Benefits

  • Base salary range of £155,000 to £163,000.
  • Eligibility for Waymo’s discretionary annual bonus program.
  • Eligibility for Waymo’s equity incentive plan.
  • Access to Waymo’s generous company benefits program, subject to eligibility requirements.
