Pathway Genomics

Pathway Genomics is a global leader in genetic testing and personalized healthcare, integrating AI and deep learning for actionable precision health information worldwide.

Industry: Health Care Providers & Services
Company size: 51-250 employees
Founded: 2008
Funding: $40M raised

Description

  • Design, operate, and scale GPU and CPU clusters for ML training and inference using Slurm and Kubernetes, including autoscaling, queueing, and quota management.
  • Automate infrastructure provisioning and configuration with infrastructure-as-code and configuration management tools.
  • Build and maintain ML pipelines for data ingestion, training, evaluation, and deployment, with support for reproducibility, traceability, and rollback.
  • Implement and evolve ML-focused CI/CD pipelines for testing, packaging, and deploying models and services.
  • Own monitoring, logging, and alerting for training and serving workloads, including utilization, latency, throughput, failures, and data or model drift.
  • Work with terabyte-scale datasets and solve related storage, networking, and performance challenges.
  • Partner with ML engineers and researchers to productionize experimental work into robust, scalable systems.
  • Participate in the on-call rotation for critical ML infrastructure and lead incident response and post-mortems.

Requirements

  • 5+ years of experience in DevOps, SRE, Platform, or Infrastructure roles running production systems, ideally with high-performance or ML workloads.
  • Former or current Linux, systems, or network administrator comfortable debugging at the OS and network layers.
  • Deep familiarity with Linux as a daily driver, including shell scripting and cluster and service configuration.
  • Strong experience with workload management, containerization, and orchestration in production environments, including Slurm, Docker, and Kubernetes.
  • Solid understanding of CI/CD tools and workflows such as GitHub Actions, GitLab CI, or Jenkins, including building pipelines from scratch.
  • Hands-on cloud infrastructure experience with AWS, GCP, or Azure, especially GPU instances, VPC/networking, storage, and managed ML services.
  • Proficiency with infrastructure as code tools such as Terraform or CloudFormation, with a bias toward automation.
  • Experience with monitoring and logging stacks such as Grafana, Prometheus, Loki, or CloudWatch.
  • Familiarity with ML pipeline and experiment orchestration tools such as MLflow, Kubeflow, Airflow, or Metaflow, and model/version management.
  • Solid programming skills in Python and the ability to read and debug code using common ML libraries such as PyTorch or TensorFlow.
  • Strong ownership mindset, comfort with ambiguity, and enthusiasm for scaling and hardening critical infrastructure in an ML-heavy environment.
  • Willingness to learn.

Benefits

  • Intellectually stimulating work environment.
  • Opportunity to work with real-time data processing and AI.
  • Work at a leading AI startup with strong career prospects.
  • Distributed team with members across the world.
  • Opportunity to make a significant contribution to the company’s success.
  • Inclusive workplace culture.
  • Permanent employment contract.
  • Remote work, with the option to meet team members in Palo Alto, Paris, or Wroclaw.
  • Candidates based anywhere in the EU, United States, and Canada will be considered.
  • Compensation based on profile and location.
