Later

Later is a top social media management and influencer platform that simplifies visual content marketing for Instagram, Facebook, Twitter, and Pinterest. With over 2 million users globally, including renowned brands like Yelp and The Huffington Post, La...

Industry: Media
Company size: 51-250
Founded: 2014

Description

  • Define and own the long-term ML infrastructure roadmap for current experimentation and future AI initiatives.
  • Establish best practices for model lifecycle management, deployment standards, monitoring, and governance.
  • Identify infrastructure gaps and design scalable solutions that enable high-velocity ML development.
  • Contribute to cross-functional technical planning so ML systems align with product and platform strategy.
  • Design, build, and maintain production-grade model deployment and inference systems using CI/CD, Docker, and API frameworks.
  • Automate end-to-end ML lifecycle workflows, including training pipelines, model validation, registry management, deployment, and rollback.
  • Implement monitoring for model performance, latency, drift detection, and infrastructure health.
  • Operate ML workloads across AWS and GCP, including GPU-based infrastructure and BigQuery datasets.
  • Develop and maintain infrastructure-as-code to ensure scalable, repeatable, and secure cloud environments.
  • Partner with data scientists, analysts, platform engineers, and product engineers to translate experimentation needs into production-ready infrastructure.

Requirements

  • 4+ years of experience in MLOps, ML infrastructure, backend engineering, or related roles supporting production ML systems.
  • Experience working in cloud-native environments, especially AWS and/or GCP, with hands-on deployment of ML workloads.
  • Proven experience designing and implementing CI/CD pipelines for ML systems.
  • Strong experience with Amazon SageMaker, Docker, Flask-based APIs, and infrastructure automation tools.
  • Hands-on experience with ML lifecycle tooling such as MLflow, SageMaker Studio, or Weights & Biases.
  • Experience managing container orchestration platforms such as Kubernetes, EKS, or GKE.
  • Strong programming experience in Python; experience with Go, Java, or Scala is a plus.
  • Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
  • Familiarity with observability tools such as CloudWatch, Prometheus, Grafana, Datadog, or centralized logging platforms.
  • Experience managing GPU-based workloads and scaling training/inference systems.
  • Familiarity with data infrastructure tools such as BigQuery and cloud-native data pipelines.
  • Bonus: experience supporting LLMs or generative AI pipelines, distributed training systems, feature stores such as Feast, real-time inference systems, or ML governance frameworks.

Benefits

  • Salary range of $145,000 to $165,000.
  • Market-based and data-driven compensation approach with biannual compensation reviews.
  • Permanent team members are eligible to participate in various benefits plans as part of their compensation package.
  • Flexible work location, with select roles open to fully remote candidates.
  • Office locations in Boston, Vancouver (BC), Chicago, and Vancouver (WA).
  • Equal opportunity employer with an inclusion-first approach and accommodations available during the recruitment process.
