Principal Machine Learning Engineer, Mobile AI Inference Optimization

1 month, 2 weeks ago
Full-time
Lead
Software Development
Unity

Unity is the top platform for real-time 3D content creation, empowering creators across industries to bring their ideas to life with interactive 2D and 3D content.

Internet Software & Services
5K-10K
Founded 2004

Description

  • Set the technical vision and roadmap for deploying multi-modal AI models to iOS and Android.
  • Make decisions on model compression, quantization, pruning, and knowledge distillation to meet mobile constraints.
  • Evaluate and adopt inference runtimes such as CoreML, ONNX Runtime Mobile, TFLite, and ExecuTorch.
  • Own the end-to-end optimization pipeline from model export through graph transformation and hardware-specific kernel tuning.
  • Collaborate with research scientists to translate new model architectures into deployable mobile implementations.
  • Design scalable multi-modal inference systems that handle images, text, primitives, and metadata with real-time performance.
  • Develop approaches for dynamic resolution, token reduction, and speculative decoding optimized for mobile devices.
  • Track and adopt advances in efficient diffusion and efficient attention methods.
  • Lead and mentor ML engineers while defining best practices, code review standards, and benchmarking methodology.
  • Partner with platform, product, and runtime teams to align ML capabilities with device constraints and roadmaps.

Requirements

  • 8+ years in ML engineering, including at least 3 years focused on on-device or edge inference optimization.
  • Proven production deployment of transformer-based models and/or JEPA-style generative architectures on mobile or embedded hardware.
  • Hands-on experience with CoreML, TFLite, ONNX Runtime, and/or ExecuTorch.
  • Deep understanding of operator fusion, memory layout, and runtime scheduling.
  • Expert-level knowledge of INT8, INT4, and FP16 quantization, weight sharing, structured and unstructured pruning, and knowledge distillation.
  • Strong understanding of mobile SoC architectures including Apple Neural Engine, Qualcomm Hexagon/Adreno, and ARM Mali.
  • Proficiency in C++, Objective-C, or Swift for runtime integration, plus Python for tooling and export pipelines.
  • Ability to read, implement, and extend ML research papers, including efficient attention, diffusion samplers, and multi-modal fusion techniques.
  • Track record of technical leadership, cross-functional influence, and engineer development.
  • Preferred: experience shipping world-model or neural rendering pipelines, such as NeRF or 3D Gaussian Splatting (3DGS), on mobile.
  • Preferred: contributions to open-source ML inference frameworks, or mobile ML research publications.
  • Preferred: familiarity with compiler stacks such as MLIR, TVM, or XLA for custom kernel generation.
  • Preferred: background in real-time graphics or game engine pipelines, including graphics APIs such as Metal, Vulkan, or OpenGL ES.
  • Strong English communication skills for frequent global collaboration.
  • International relocation support is not available for this position.

Benefits

  • Base salary range of $278,100 to $347,600 USD, depending on location and experience.
  • Comprehensive health, life, and disability insurance.
  • Employee stock ownership.
  • Competitive retirement or pension plans.
  • Generous vacation and personal days.
  • Support for new parents through leave and family-care programs.
  • Mental health and wellbeing programs and support.
  • Training and development programs.
