Staff Machine Learning Engineer, Offline Infrastructure

1 week, 3 days ago
Full-time
Lead
Software Development
Unity

Unity is the top platform for real-time 3D content creation, empowering creators across industries to bring their ideas to life with interactive 2D and 3D content.

Internet Software & Services
5K-10K
Founded 2004

Description

  • Design and operate large-scale data pipelines that generate datasets for machine learning training and experimentation.
  • Develop infrastructure that supports distributed training workflows using tools such as PyTorch, Ray Data, and Ray Train.
  • Integrate ML pipelines with workflow orchestration systems such as Flyte, Airflow, or similar platforms.
  • Improve reproducibility and observability through dataset validation, monitoring, and automated testing.
  • Optimize performance and resource utilization across distributed compute systems for data processing and model training.
  • Partner closely with ML engineers to support large-scale experimentation and model iteration.
  • Lead architectural improvements to keep offline ML pipelines scalable, reliable, and cost-efficient.

Requirements

  • Strong experience building large-scale ML pipelines.
  • Experience with distributed computing frameworks such as Ray, Spark, or Flink, including familiarity with the Ray ecosystem (Ray Data, Ray Train).
  • Experience building infrastructure for training data generation, dataset preparation, or ML feature pipelines.
  • Deep experience designing and operating production-grade data pipelines.
  • Strong programming skills in Python and experience with large-scale distributed workloads.
  • Experience with modern data infrastructure, including data lakes, data warehouses, orchestration systems, and streaming platforms.
  • Strong systems thinking with the ability to reason about performance, scalability, reliability, and cost tradeoffs in distributed systems.
  • Proven ability to lead technical direction and influence architectural decisions across teams without formal authority.
  • Strong English proficiency for frequent verbal and written professional communication with global colleagues and partners.
  • Experience with PyTorch, Flyte, or Airflow is preferred.

Benefits

  • Gross salary of $209,700 to $283,800 USD.
  • Comprehensive health, life, and disability insurance.
  • Commute subsidy.
  • Employee stock ownership.
  • Competitive retirement or pension plans.
  • Generous vacation and personal days.
  • Support for new parents through leave and family-care programs.
  • Mental health and wellbeing programs and support.
  • Training and development programs.
  • Volunteering and donation matching program.
