Later

Later is a top social media management and influencer platform that simplifies visual content marketing for Instagram, Facebook, Twitter, and Pinterest. With over 2 million users globally, including renowned brands like Yelp and The Huffington Post, La...

Media | 51-250 employees | Founded 2014

Description

  • Define and own the long-term ML infrastructure roadmap for experimentation and future AI initiatives.
  • Establish best practices for model lifecycle management, deployment standards, monitoring, and governance.
  • Design scalable solutions to fill infrastructure gaps and support faster ML development.
  • Build and maintain production-grade model deployment and inference systems using CI/CD, Docker, and APIs.
  • Automate ML workflows for training, validation, registry management, deployment, and rollback.
  • Implement monitoring for model performance, latency, drift, and infrastructure health.
  • Operate ML workloads across AWS and GCP, including GPU-based infrastructure and BigQuery datasets.
  • Develop and maintain infrastructure-as-code to create scalable, repeatable, and secure cloud environments.
  • Optimize CI/CD workflows for ML and infrastructure automation.
  • Partner with data scientists, analysts, platform engineers, and product engineers to translate experimentation needs into production-ready systems.

Requirements

  • 4+ years of experience in ML Ops, ML infrastructure, backend engineering, or a related role supporting production ML systems.
  • Experience working in cloud-native environments with AWS and/or GCP.
  • Proven experience designing and implementing CI/CD pipelines for ML systems.
  • Strong experience with Amazon SageMaker, Docker, Flask-based APIs, and infrastructure automation tools.
  • Hands-on experience with ML lifecycle tooling such as MLflow, SageMaker Studio, or Weights & Biases.
  • Experience managing container orchestration platforms such as Kubernetes, EKS, or GKE.
  • Strong programming experience in Python; additional experience in Go, Java, or Scala is a plus.
  • Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
  • Familiarity with observability tools such as CloudWatch, Prometheus, Grafana, Datadog, or centralized logging platforms.
  • Experience managing GPU-based workloads and scaling training and inference systems.
  • Familiarity with data infrastructure tools such as BigQuery and cloud-native data pipelines.
  • Bonus: experience supporting LLMs or generative AI pipelines, distributed training systems, feature stores like Feast, real-time inference systems, or ML governance frameworks.
  • A mindset focused on automation, reliability, performance, and continuous improvement in fast-scaling environments.

Benefits

  • Salary range of $145,000 to $165,000.
  • Market-based, data-driven compensation approach with biannual review.
  • Permanent team members are eligible for a broader benefits package.
  • Fully remote option available for select positions.
  • Offices in Boston, Vancouver (BC), Chicago, and Vancouver (WA).
  • Inclusive, equal opportunity workplace with accommodations available during the recruitment process.
