Sr. Staff AI Research TLM - AI Systems

1 month ago
Full-time
Lead
DevOps and Infrastructure
Databricks

Databricks is the pioneering data intelligence platform, empowering organizations worldwide to solve complex data challenges with AI-driven analytics solutions.

IT Services
1K-5K
Founded 2013
$4450M raised

Description

  • Lead and grow a multidisciplinary research team focused on LLM scaling, efficiency, and systems performance.
  • Define the scaling research roadmap aligned with Databricks’ strategic objectives.
  • Drive algorithmic innovations for large-scale training and inference, including optimizers, low-precision techniques, and model adaptation methods.
  • Design and run large-scale experiments and benchmark new methods against state-of-the-art approaches.
  • Optimize distributed training, parallelism, memory efficiency, and compute efficiency in collaboration with systems and infrastructure teams.
  • Work hands-on in Python and PyTorch to prototype research ideas and integrate them into production systems.
  • Establish metrics, evaluation protocols, and best practices for scaling-focused research across Databricks AI.
  • Partner with product and engineering leaders to translate research breakthroughs into customer-facing platform capabilities.
  • Champion responsible deployment, ensuring reliability, safety, and model behavior remain first-class considerations.
  • Mentor and develop researchers and engineers through technical guidance and career support.

Requirements

  • Proven ability to lead a research team developing novel techniques for foundation model efficiency with strong industry impact.
  • Deep expertise in at least one of: generative AI, LLMs, distributed ML systems, model optimization, or responsible AI.
  • Strong programming skills with demonstrated ability to write high-quality, efficient code in Python and PyTorch.
  • Experience translating research innovations into scalable product capabilities in partnership with product and engineering teams.
  • Excellent communication, leadership, and stakeholder management skills.
  • Prior work at the intersection of systems and ML, such as distributed training frameworks, compiler and kernel optimization, or memory- and compute-efficient model design (preferred).
  • Strong industry and academic network in large-scale ML, with collaborations or service at top conferences, such as program committee (PC) or area chair roles (preferred).
  • A strong record of research impact, such as first-author publications at ICLR, ICML, NeurIPS, or MLSys, influential open-source contributions, or widely used deployed systems (preferred).

Benefits

  • Annual performance bonus eligibility.
  • Equity eligibility.
  • Competitive local pay range of $270,000 to $340,000 USD.
  • Comprehensive benefits and perks package.
  • Region-specific benefits details provided by Databricks.
  • Commitment to a diverse and inclusive workplace.
