Lead Machine Learning Engineer, Inference & Performance

1 hour, 35 minutes ago
Full-time
Senior
Software Development
Egen.ai

Egen.ai provides technology services that use cloud computing, data analytics, and artificial intelligence to improve document intelligence and drive productivity and growth for its clients.

IT Services
Founded 2000

Description

  • Own the full lifecycle of AI features from initial prototype to robust, scalable production services.
  • Design and optimize production LLM serving to maximize throughput and minimize latency.
  • Instrument and profile training runs to identify bottlenecks and improve performance.
  • Tune attention implementations and other inference/training techniques for specific hardware platforms.
  • Deploy and operate multiple models within shared GPU clusters on Google Kubernetes Engine (GKE).
  • Improve GPU utilization, throughput-per-dollar, and overall fleet efficiency.
  • Collaborate with clients to translate business needs and constraints into AI architectures.
  • Write clean, maintainable code and apply a disciplined software engineering approach to AI systems.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • 5+ years of experience in ML/AI engineering, with meaningful experience in performance, infrastructure, or systems.
  • Proven track record of deploying and optimizing models in a production environment.
  • Demonstrated experience profiling and improving GPU utilization for training and/or inference.
  • Hands-on experience with vLLM, SGLang, or comparable high-performance serving stacks.
  • Strong Kubernetes experience, specifically deploying and autoscaling multiple models on shared GPU clusters on Google Cloud/GKE.
  • Mastery of Python and shell scripting.
  • Solid grasp of GPU architecture, LLM inference fundamentals, and the attention mechanism.
  • Fluency with profiling tools for diagnosing compute-bound and memory-bound bottlenecks.
  • Knowledge of data engineering and SQL.
  • Experience with classic machine learning, neural nets, training, and tuning is a strong plus.
  • Comfort reading lower-level CUDA-adjacent performance code is a strong plus.

Benefits

  • Competitive salary.
  • Comprehensive health insurance.
  • Paid leave, including vacation/PTO.
  • Paid holidays.
  • Sick leave.
  • Parental leave.
  • Bereavement leave.
  • 401(k) employer match.
  • Employee referral bonuses.
