JetBrains

JetBrains builds software development tools such as the IntelliJ IDEA IDE and the Kotlin programming language, automating routine tasks to boost developer productivity.

Internet Software & Services
1K-5K
Founded 2000

Description

  • Design, implement, and maintain SFT and RL post-training pipelines for multi-step coding agents.
  • Train and adapt large language models for planning, tool use, and multi-step interactions in JetBrains IDEs.
  • Build simulation and evaluation environments where coding agents can perform and be measured on realistic developer tasks.
  • Design evaluation frameworks and metrics for agent behavior, and use traces and logs to improve training, data, and reward design.
  • Analyze training and evaluation results to improve model architectures, training recipes, and datasets.
  • Work with distributed GPU clusters and MapReduce-style infrastructure for training and data processing.
  • Collaborate with research, product, and infrastructure teams to turn product goals into models, experiments, and shipped features.

Requirements

  • Extensive hands-on experience training LLMs in pre-training, fine-tuning, or post-training settings.
  • Deep expertise in PyTorch and specialized LLM training stacks such as Megatron, NeMo, or verl.
  • Strong understanding of LLM fundamentals, including architectures, tokenization, data pipelines, batching, mixed precision, distributed training, and debugging unstable runs.
  • Ability to own projects end to end from problem definition through design, experimentation, implementation, and iteration.
  • Product-aware mindset with the ability to translate developer needs and failure modes into modeling and evaluation work.
  • At least 3 years of Python experience writing clean, maintainable code in modern ML codebases.
  • Experience with ML orchestrators or workflow tools such as Kubeflow, Dagster, Airflow, or ZenML, or schedulers like Kubernetes or SLURM (preferred).
  • Experience with large-scale data and training pipelines, including MapReduce-style clusters, multi-node GPU training, or workloads of 1M+ CPU/GPU hours (preferred).
  • Experience designing and maintaining evaluation pipelines for LLMs or agents, including metrics, dashboards, experiment tracking, and automated regression checks (preferred).
  • Experience with AI agent development, including tool-using agents, planners, or multi-step coding workflows (preferred).
