Principal Research Scientist - AI Scaling & Optimization

2 weeks, 6 days ago
Full-time
Lead
DevOps and Infrastructure
Databricks

Databricks is the pioneering data intelligence platform, empowering organizations worldwide to solve complex data challenges with AI-driven analytics solutions.

IT Services
1K-5K
Founded 2013
$4.45B raised

Description

  • Lead and grow a multidisciplinary research team focused on LLM scaling, efficiency, and systems performance.
  • Define the scaling research roadmap in alignment with Databricks’ strategic objectives.
  • Drive algorithmic innovations for large-scale neural network training and inference, including optimizers, low-precision techniques, and adaptation methods.
  • Optimize end-to-end ML systems for distributed training, reinforcement learning, memory efficiency, and compute efficiency.
  • Partner with product and engineering teams to turn research breakthroughs into production platform capabilities.
  • Oversee large-scale experiments, benchmarking, and evaluation of trade-offs in quality, latency, throughput, and cost.
  • Establish metrics, evaluation protocols, and best practices for scaling-focused research.
  • Champion responsible and robust deployment of scaling innovations.
  • Represent Databricks AI research externally through publications, talks, and collaborations with academia and open source communities.
  • Mentor and develop researchers and engineers through technical guidance and career support.

Requirements

  • Proven ability to lead a research team developing novel techniques for foundation model efficiency and related topics.
  • Strong track record of industry impact in large-scale machine learning or AI research.
  • Deep expertise in at least one of: generative AI, LLMs, distributed ML systems, model optimization, or responsible AI.
  • Strong programming skills and demonstrated ability to write high-quality, efficient code in Python and PyTorch.
  • Demonstrated ability to translate research innovation into scalable product capabilities with product and engineering teams.
  • Excellent communication, leadership, and stakeholder management skills.
  • Experience influencing cross-functional roadmaps and aligning research with business impact.
  • Prior work at the intersection of systems and ML, such as distributed training frameworks, compiler and kernel optimization, or memory-/compute-efficient model design (preferred).
  • Strong industry and academic network in large-scale ML, including collaborations or service at top conferences (preferred).
  • A strong record of research impact such as first-author publications, influential open-source contributions, or widely used deployed systems, especially in optimization or efficiency (preferred).

Benefits

  • Base salary range of $270,000 to $350,000 USD.
  • Eligibility for annual performance bonus.
  • Eligibility for equity.
  • Comprehensive benefits and perks package.
  • Opportunity to work at a company with offices around the globe.
  • Location-based pay range for remote roles, with compensation determined by experience, certifications, and work location.
