Principal Machine Learning Engineer, Mobile AI Inference Optimization

1 hour, 25 minutes ago
Full-time
Lead
Software Development
Unity

Unity is the top platform for real-time 3D content creation, empowering creators across industries to bring their ideas to life with interactive 2D and 3D content.

Internet Software & Services
5K-10K
Founded 2004

Description

  • Set the technical vision and roadmap for deploying multi-modal AI models to iOS and Android.
  • Decide which model compression techniques (quantization, pruning, knowledge distillation) to apply to meet mobile constraints.
  • Evaluate and adopt inference runtimes such as CoreML, ONNX Runtime Mobile, TFLite, and ExecuTorch.
  • Own the end-to-end optimization pipeline from model export through graph transformation and hardware-specific kernel tuning.
  • Collaborate with research scientists to translate new model architectures into deployable mobile implementations.
  • Design scalable multi-modal inference systems that handle images, text, primitives, and metadata with real-time performance.
  • Develop approaches for dynamic resolution, token reduction, and speculative decoding optimized for mobile devices.
  • Track and adopt advances in efficient diffusion and efficient attention methods.
  • Lead and mentor ML engineers while defining best practices, code review standards, and benchmarking methodology.
  • Partner with platform, product, and runtime teams to align ML capabilities with device constraints and roadmaps.

Requirements

  • 8+ years in ML engineering, including at least 3 years focused on on-device or edge inference optimization.
  • Proven production deployment of transformer-based models and/or JEPA-style generative architectures on mobile or embedded hardware.
  • Hands-on experience with CoreML, TFLite, ONNX Runtime, and/or ExecuTorch.
  • Deep understanding of operator fusion, memory layout, and runtime scheduling.
  • Expert-level knowledge of INT8, INT4, and FP16 quantization, weight sharing, structured and unstructured pruning, and knowledge distillation.
  • Strong understanding of mobile SoC architectures including Apple Neural Engine, Qualcomm Hexagon/Adreno, and ARM Mali.
  • Proficiency in C++, Objective-C, or Swift for runtime integration, plus Python for tooling and export pipelines.
  • Ability to read, implement, and extend ML research papers, including efficient attention, diffusion samplers, and multi-modal fusion techniques.
  • Track record of technical leadership, cross-functional influence, and engineer development.
  • Experience shipping world-model or neural rendering pipelines such as NeRF or 3DGS on mobile, preferred.
  • Contributions to open-source ML inference frameworks or mobile ML research publications, preferred.
  • Familiarity with compiler stacks such as MLIR, TVM, or XLA for custom kernel generation, preferred.
  • Background in real-time graphics or game engine pipelines such as Metal, Vulkan, or OpenGL ES, preferred.
  • Strong English communication skills for frequent global collaboration.
  • International relocation support is not available for this position.

Benefits

  • Base salary range of $278,100 to $347,600 USD, depending on location and experience.
  • Comprehensive health, life, and disability insurance.
  • Employee stock ownership.
  • Competitive retirement or pension plans.
  • Generous vacation and personal days.
  • Support for new parents through leave and family-care programs.
  • Mental health and wellbeing programs and support.
  • Training and development programs.
