Unity

Unity is the top platform for real-time 3D content creation, empowering creators across industries to bring their ideas to life with interactive 2D and 3D content.

Internet Software & Services

Information Technology

5K-10K (6748)

Founded 2004

123 open positions

Senior Machine Learning Engineer, ML Infrastructure - Online

23 hours, 37 minutes ago

China

Full-time

Senior

Machine Learning Engineer

Software Development

Kubernetes, Machine Learning, Python, PyTorch

Description

Design and operate large-scale online inference infrastructure that serves production ML models with low latency and high reliability.
Build and improve model serving systems using frameworks such as PyTorch, Triton Inference Server, Kubernetes, GKE, Ray, or similar distributed serving technologies.
Optimize inference performance through batching, model compilation, GPU/CPU utilization improvements, request scheduling, and runtime tuning.
Develop infrastructure for model deployment, canary testing, A/B experimentation, traffic splitting, rollback, and production validation.
Improve observability for online ML systems through latency, throughput, error-rate, cost, saturation, and model-health monitoring.
Build self-healing and autoscaling capabilities to support dynamic experiment traffic and production reliability requirements.
Partner closely with ML engineers to support faster model iteration while maintaining production safety, scalability, and cost efficiency.
Improve the reliability and reproducibility of model serving workflows, including packaging, artifact validation, compatibility testing, and deployment automation.
Lead architectural improvements that make the online ML platform more robust, user-friendly, scalable, and cost-efficient.

Requirements

Strong experience building and operating production-grade online ML inference systems.
Experience with model serving frameworks such as NVIDIA Triton Inference Server, TorchServe, Ray Serve, TensorFlow Serving, or similar systems.
Experience optimizing inference workloads using dynamic batching, model compilation, quantization, GPU acceleration, GPU kernel optimization, caching, or runtime tuning.
Strong experience with distributed systems, Kubernetes, autoscaling, service reliability, and production observability.
Strong programming skills in Python, with practical experience working on production ML systems and high-scale services.
Experience with PyTorch and modern model deployment workflows, including packaging, validation, and serving lifecycle management.
Experience designing infrastructure for safe model rollout, canary testing, A/B experimentation, and automated rollback.
Strong systems thinking with the ability to reason about latency, throughput, reliability, scalability, and cost tradeoffs in online systems.
Proven ability to lead technical direction and influence architectural decisions across teams without formal authority.
Relocation support is not available for this position.
Work visa or immigration sponsorship is not available for this position.

Benefits

Comprehensive health, life, and disability insurance.
Commute subsidy.
Employee stock ownership.
Competitive retirement or pension plans.
Generous vacation and personal days.
Support for new parents through leave and family-care programs.
Mental health and wellbeing programs and support.
Training and development programs.
Office food and snacks.
Employee Resource Groups.
Global Employee Assistance Program.
Volunteering and donation matching program.
