Principal Research Scientist - AI Scaling & Optimization

10 hours, 10 minutes ago
Full-time
Lead
DevOps and Infrastructure
Databricks

Databricks is the pioneering data intelligence platform, empowering organizations worldwide to solve complex data challenges with AI-driven analytics solutions.

IT Services
1K-5K employees
Founded 2013
$4450M raised

Description

  • Lead and grow a multidisciplinary research team focused on LLM scaling, efficiency, and systems performance.
  • Define the scaling research roadmap in alignment with Databricks’ strategic objectives.
  • Drive algorithmic innovations for large-scale neural network training and inference, including optimizers, low-precision techniques, and adaptation methods.
  • Optimize end-to-end ML systems for distributed training, reinforcement learning, memory efficiency, and compute efficiency.
  • Partner with product and engineering teams to turn research breakthroughs into production platform capabilities.
  • Oversee large-scale experiments, benchmarking, and evaluation of trade-offs in quality, latency, throughput, and cost.
  • Establish metrics, evaluation protocols, and best practices for scaling-focused research.
  • Champion responsible and robust deployment of scaling innovations.
  • Represent Databricks AI research externally through publications, talks, and collaborations with academia and open source communities.
  • Mentor and develop researchers and engineers through technical guidance and career support.

Requirements

  • Proven ability to lead a research team developing novel techniques for foundation model efficiency and related topics.
  • Strong track record of industry impact in large-scale machine learning or AI research.
  • Deep expertise in at least one of: generative AI, LLMs, distributed ML systems, model optimization, or responsible AI.
  • Strong programming skills and demonstrated ability to write high-quality, efficient code in Python and PyTorch.
  • Demonstrated ability to translate research innovation into scalable product capabilities with product and engineering teams.
  • Excellent communication, leadership, and stakeholder management skills.
  • Experience influencing cross-functional roadmaps and aligning research with business impact.
  • Prior work at the intersection of systems and ML, such as distributed training frameworks, compiler and kernel optimization, or memory-/compute-efficient model design (preferred).
  • Strong industry and academic network in large-scale ML, including collaborations or service at top conferences (preferred).
  • A strong record of research impact such as first-author publications, influential open-source contributions, or widely used deployed systems, especially in optimization or efficiency (preferred).

Benefits

  • Base salary range of $270,000 to $350,000 USD.
  • Eligibility for annual performance bonus.
  • Eligibility for equity.
  • Comprehensive benefits and perks package.
  • Opportunity to work at a company with offices around the globe.
  • Location-based pay range for remote work, with compensation determined by experience, certifications, and work location.
