Principal Machine Learning Engineer, Mobile AI Inference Optimization

2 months, 1 week ago
Full-time
Lead
Software Development
Unity

Unity is the top platform for real-time 3D content creation, empowering creators across industries to bring their ideas to life with interactive 2D and 3D content.

Internet Software & Services
5K-10K
Founded 2004

Description

  • Set the technical vision and roadmap for deploying multi-modal AI models to iOS and Android.
  • Decide which model compression techniques (quantization, pruning, knowledge distillation) to apply to meet mobile constraints.
  • Evaluate and adopt inference runtimes such as Core ML, ONNX Runtime Mobile, TFLite, and ExecuTorch.
  • Own the end-to-end optimization pipeline from model export through graph transformation and hardware-specific kernel tuning.
  • Collaborate with research scientists to translate new model architectures into deployable mobile implementations.
  • Design scalable multi-modal inference systems that handle images, text, primitives, and metadata with real-time performance.
  • Develop approaches for dynamic resolution, token reduction, and speculative decoding optimized for mobile devices.
  • Track and adopt advances in efficient diffusion and efficient attention methods.
  • Lead and mentor ML engineers while defining best practices, code review standards, and benchmarking methodology.
  • Partner with platform, product, and runtime teams to align ML capabilities with device constraints and roadmaps.

Requirements

  • 8+ years in ML engineering, including at least 3 years focused on on-device or edge inference optimization.
  • Proven production deployment of transformer-based models and/or JEPA-style architectures on mobile or embedded hardware.
  • Hands-on experience with Core ML, TFLite, ONNX Runtime, and/or ExecuTorch.
  • Deep understanding of operator fusion, memory layout, and runtime scheduling.
  • Expert-level knowledge of INT8, INT4, and FP16 quantization, weight sharing, structured and unstructured pruning, and knowledge distillation.
  • Strong understanding of mobile SoC architectures including Apple Neural Engine, Qualcomm Hexagon/Adreno, and ARM Mali.
  • Proficiency in C++, Objective-C, or Swift for runtime integration, plus Python for tooling and export pipelines.
  • Ability to read, implement, and extend ML research papers, including efficient attention, diffusion samplers, and multi-modal fusion techniques.
  • Track record of technical leadership, cross-functional influence, and engineer development.
  • Preferred: experience shipping world-model or neural rendering pipelines, such as NeRF or 3D Gaussian Splatting (3DGS), on mobile.
  • Preferred: contributions to open-source ML inference frameworks, or mobile ML research publications.
  • Preferred: familiarity with compiler stacks such as MLIR, TVM, or XLA for custom kernel generation.
  • Preferred: background in real-time graphics or game engine pipelines, including graphics APIs such as Metal, Vulkan, or OpenGL ES.
  • Strong English communication skills for frequent global collaboration.
  • International relocation support is not available for this position.

Benefits

  • Base salary range of $278,100 to $347,600 USD, depending on location and experience.
  • Comprehensive health, life, and disability insurance.
  • Employee stock ownership.
  • Competitive retirement or pension plans.
  • Generous vacation and personal days.
  • Support for new parents through leave and family-care programs.
  • Mental health and wellbeing programs and support.
  • Training and development programs.


