Veeam Software

Veeam Software is the global leader in backup, delivering Modern Data Protection solutions for virtual environments, enterprises, small businesses, and service providers worldwide.

Internet Software & Services
1K-5K
Founded 2006
$500M raised

Description

  • Own the end-to-end operationalization of ML/AI solutions, taking models from development to scalable, reliable production systems.
  • Design, automate, and maintain CI/CD pipelines for model training, testing, deployment, and retraining (Azure DevOps, Databricks).
  • Build, optimize, and version model lifecycle workflows to ensure reproducibility, lineage, and governance across the ML/AI platform.
  • Monitor production models for performance, drift, reliability, and resource usage, and implement automated retraining and corrective workflows.
  • Optimize compute, storage, and orchestration on the Databricks platform to ensure efficient and cost-effective operations.
  • Design, consume, and integrate REST APIs to expose ML/AI models as services for real-time or near-real-time inference.
  • Collaborate closely with ML/AI Scientists, Data Engineers, Data Warehouse teams, and product stakeholders to transform research-grade models into production-ready services and integrate intelligence into tools such as Copilot, Salesforce, and Tableau.
  • Contribute to evolving the ML/AI platform, tooling, automation standards, and best practices across the organization.

Requirements

  • 7+ years of experience operationalizing ML/AI models, including deployment, automation, monitoring, and lifecycle management.
  • Strong programming skills in Python, PySpark, and SQL, with the ability to write production-ready code.
  • Experience with feature engineering and data engineering fundamentals, including designing, validating, and optimizing feature pipelines and ensuring feature consistency.
  • Experience building vector embeddings and retrieval-augmented generation (RAG) systems.
  • Familiarity with ML and LLM model development and common ML libraries and frameworks.
  • Experience with MLflow or similar tools for model tracking, registry management, and lifecycle operations.
  • Familiarity with CI/CD pipelines (Azure DevOps preferred) and experience designing deployment pipelines for models.
  • Strong understanding of data versioning, model versioning, reproducibility, and data lineage in governed ML/AI environments.
  • Experience designing, consuming, or integrating REST APIs to support real-time or near-real-time inference and production serving.
  • Bonus/optional: familiarity with data quality frameworks, infrastructure-as-code, Unix/DevOps principles, Docker/Kubernetes, real-time/streaming serving architectures, AI agent tools/MCP servers, and a strong interest in platform evolution and tooling.

Benefits

  • Two weeks of paid vacation, 12 statutory holidays, plus 4 global VeeaMe Days and 24 paid volunteer hours annually through Veeam Cares.
  • Paid parental leave: 8 days for fathers, 122 days for birthing parents, and 92 days for adoptive parents.
  • Medical, dental, and vision coverage fully funded through INS Premium for employees and dependents.
  • Mental health support, therapy sessions, and virtual care via the Employee Assistance Program.
  • Retirement and social security contributions through Costa Rica’s statutory programs.
  • Life insurance equal to 24x monthly salary, plus disability and funeral coverage.
  • Daily cafeteria subsidy.
  • Learning and development resources, including LinkedIn Learning, O’Reilly, mentoring, workshops, and an annual Global Day of Learning.
