Zyte

Zyte is a leading provider of a full-stack web scraping API and world-class data extraction services. With its AI-powered web scraping platform, Zyte offers fast and reliable data extraction solutions for over 2,000 companies and 1 million developers worldwi...

Professional Services

Industrials

251-1K employees (300)

Founded 2010

$3M raised

11 open positions

Core & ML Ops Team Lead - Remote

3 weeks, 1 day ago

Brazil, Spain, Hungary, Poland, Portugal

Full-time

Lead

Machine Learning Engineer

DevOps and Infrastructure

Apache Airflow C++ CI/CD Docker Go Java Kafka Kubernetes Linux Mesos Microservices Python REST API Rust TCP/IP

Description

Design and evolve the core platform infrastructure (container orchestration, GPU scheduling/autoscaling, and distributed compute).
Own the model platform including registry, experiment tracking, training orchestration, evaluation, serving, and monitoring.
Build and maintain the Golden Path: reference repositories, scaffold CLI, opinionated CI/CD pipelines, runtime contracts (health/metrics/tracing/SLOs), and production-ready defaults.
Operate a secure, multi-tenant model registry and training platform with standardized experiment/evaluation harnesses.
Provide turnkey serving patterns for online and batch inference, including drift/quality monitoring and rollback playbooks.
Integrate public and open-source AI capabilities as managed platform services with cost and data-governance guardrails.
Run the squad: set roadmap and priorities, drive delivery, mentor engineers, and uphold high engineering standards and platform thinking.
Partner with product engineering, Prod Ops, and Security on adoption, rollout plans, observability, billing/cost tracking, and supply-chain security.

Requirements

5+ years of experience building distributed systems and 3+ years in MLOps/ML platform engineering (or equivalent impact).
Proven track record designing and operating model platforms in production (registry, training, serving, monitoring).
Deep understanding of Linux/OS internals (process model, cgroups/namespaces), networking (TCP/IP, HTTP/2), concurrency, and performance profiling.
Strong knowledge of Kubernetes (Mesos experience a bonus) and GPU infrastructure provisioning, scheduling, containerization, and optimization.
Proficiency in developing high-performance services in Java, Rust, Go, or C++ (bonus: Vert.x/Netty); strong Python skills.
Demonstrated success leading technical teams and implementing organization-wide platform solutions.
Experience with observability and reliability practices (logging/metrics/tracing pipelines, SLIs/SLOs, incident management) and cost governance for ML/AI workloads.
Familiarity with streaming and workflow tools (Kafka, Argo, Temporal, Airflow) and experience with multi-tenant quotas/fairness.
Hands-on experience authoring Golden Paths (service templates, CI/CD blueprints, CLI scaffolds) and supply-chain security practices (SBOM, image signing).
Preferred: experience with eBPF-based observability, perf tooling, or io_uring, and with cost-optimization strategies for ML workloads.

Benefits

Completely remote company with the freedom and flexibility to work from where you do your best work.
Be part of a self-motivated, progressive, multi-cultural team that fosters and nourishes new ideas.
Opportunity to work with cutting-edge open-source technologies and tools.
Support for bringing new ideas to market and a culture focused on innovation.

Similar Roles

Senior Machine Learning Engineer, AI Platform

Affinity 251-1K IT Services

Affinity is hiring a Senior Machine Learning Engineer for its AI Platform team to build production ML systems that extract, retrieve, and rank insights from massive relationship and business interaction data for its CRM platform.

United States Full-time Senior Machine Learning Engineer

$160k-$235k

CI/CD Feature Engineering Machine Learning Python PyTorch Scikit-learn

1 hour, 54 minutes ago

AI Tech Lead - Staff Machine Learning Engineer

Sumo Logic 251-1K Internet Software & Services

Sumo Logic is hiring a Staff Machine Learning Engineer – AI Tech Lead to lead the design and production delivery of agentic AI systems for Security Operations Center use cases at global scale.

United States Full-time Lead AI Engineer Machine Learning Engineer

Apache Airflow AWS Azure Docker GCP Kubernetes LLM Machine Learning MLflow Python PyTorch System Design Vertex AI

2 hours, 14 minutes ago

AI/ML Engineer (AWS)

Reply 10K-50K Internet Software & Services

Valorem Reply is hiring a Senior AI/ML Engineer in Irvine or Los Angeles to build and evolve AWS-based machine learning and Generative AI applications for enterprise customers.

United States Full-time Mid Level Machine Learning Engineer

$120k-$155k

Agile AWS CI/CD Generative AI LLM Machine Learning Python

4 hours, 15 minutes ago

Machine Learning Principal Solutions Architect

phData 251-1K IT Services

phData is hiring a Principal Solutions Architect to lead delivery of AI/ML solutions for enterprise clients while also driving strategic account growth and client engagement.

United States Full-time Lead Machine Learning Engineer Solutions Engineer

AWS Azure Databricks dbt Django Docker Flask GCP Java Keras Kubernetes Machine Learning MLflow Python SageMaker Scala Scikit-learn Snowflake Spring TensorFlow Vertex AI

4 hours, 48 minutes ago
