AI/ML Research Engineer, LLM Post-Training & Evaluation

5 hours, 59 minutes ago
Full-time
Mid Level
Software Development
Innodata

Innodata Inc. is a global leader in data engineering. It offers end-to-end AI solutions and platforms to businesses worldwide, combining AI with human expertise to solve complex data challenges.

IT Services
1K-5K employees
Founded 1988

Description

  • Lead or co-lead technically complex ML engineering projects from customer discussions through implementation and delivery.
  • Design, build, and improve LLM training and post-training pipelines, including data ingestion, preprocessing, fine-tuning, evaluation, and experiment tracking.
  • Implement and optimize evaluation systems for LLMs and multimodal models, including offline benchmarks and task-specific test harnesses.
  • Integrate human-in-the-loop and AI-augmented evaluation signals into model development workflows.
  • Build robust infrastructure and tooling for reproducible experimentation, metrics logging, and regression monitoring.
  • Diagnose model behavior and pipeline failures, including data issues, training instability, metric inconsistencies, and evaluation drift.
  • Collaborate with Language Data Scientists, Applied Research Scientists, data engineers, and customer technical stakeholders to translate evaluation frameworks into executable systems.
  • Contribute to internal research and platform development, including benchmark frameworks, evaluation tooling, and post-training workflow improvements.
  • Contribute to best practices and standards for LLM training, evaluation, and quality assurance across projects.
  • Mentor junior engineers and contribute to technical design reviews, documentation, and engineering rigor across the team.

Requirements

  • BS/MS/PhD in Computer Science, Machine Learning, AI, Applied Mathematics, or a related quantitative technical field; MS/PhD preferred.
  • 2-3 years of relevant industry or research engineering experience in ML/AI systems.
  • Hands-on experience with LLM training, fine-tuning, or post-training, including supervised fine-tuning, preference optimization, RLHF/RLAIF-style workflows, or task/domain adaptation.
  • Strong programming skills in Python and experience building production-quality ML code.
  • Experience with modern ML frameworks and tooling such as PyTorch, JAX, TensorFlow, Hugging Face, vLLM, or distributed training stacks.
  • Experience designing and implementing evaluation pipelines for LLM/ML systems, including metrics computation, dataset handling, and experiment comparisons.
  • Strong understanding of data pipelines and ML systems engineering, including reproducibility, observability, and debugging.
  • Experience with large-scale distributed ML systems and performance optimization for training/evaluation workloads, preferably in GPU or accelerator environments.
  • Experience with large-scale data processing and workflow orchestration in support of model training and evaluation.
  • Ability to collaborate directly with technical stakeholders, including research scientists, ML engineers, data engineers, and customer technical leads.
  • Strong written and verbal communication skills, including the ability to explain complex technical tradeoffs to technical and non-technical audiences.
  • Experience training, fine-tuning, and evaluating transformer-based models.
  • Understanding of post-training workflows and model iteration loops.
  • Familiarity with inference-time considerations such as latency, throughput, and memory/performance tradeoffs.
  • Experience implementing automated evaluation pipelines and test harnesses.
  • Experience with experiment tracking, versioning, and reproducibility practices.
  • Ability to assess metric quality and ensure consistency across model comparisons.
  • Proficiency in Python and strong software engineering fundamentals.
  • Experience with data processing pipelines, storage formats, and scalable dataset workflows.
  • Familiarity with CI/CD, testing, and engineering quality practices for ML systems.

Benefits

  • Expected salary range of $80,000 to $175,000 USD per year, based on experience, skills, and qualifications.
  • Opportunity to work on LLM training, post-training, and evaluation systems for foundation model builders and leading labs.
  • Work with a cross-functional team spanning language data science, applied research, data engineering, and customer technical stakeholders.
  • Contribute to internal R&D efforts on benchmark datasets, evaluation frameworks, and reusable infrastructure.
  • Help shape best practices and standards for LLM training, evaluation, and quality assurance across projects.
