Mistral AI

Mistral AI is a French AI company that builds frontier AI models, assistants, agents, and services for consumers and enterprises. Its mission is to make frontier AI accessible to everyone and to democratize AI through open-source, efficient, and innovative models, products, and solutions.

Artificial Intelligence

Technology

201-500 (500)

Founded 2023

8 open positions

Mistral Cloud - Site Reliability Engineer

2 months ago

France, Spain, United Kingdom, Belgium, Germany, Italy, Netherlands

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

Bash CI/CD CloudFormation Datadog Docker ELK Stack Go Grafana Kubernetes Prometheus Python Terraform

Description

Design, build, and maintain scalable, highly available, and fault-tolerant infrastructure.
Operate production systems, troubleshoot incidents, interruptions, and user issues, and address infrastructure scaling needs.
Implement and improve monitoring, alerting, and incident response systems to reduce downtime.
Build and maintain CI/CD, containerization, orchestration, logging, and observability workflows for APIs and training runs.
Participate in on-call rotations and perform root cause analysis for incidents.
Drive automation, deployment, and orchestration improvements across the infrastructure stack.
Collaborate with software engineers on safe, reproducible model-training experiments and platform abstractions.
Develop new tooling, automation scripts, APIs, dashboards, and web apps to improve reliability and performance.
Work with the security team to ensure infrastructure meets security and compliance requirements.
Document processes and contribute to open-source projects, publications, blogs, and conferences.

Requirements

Master’s degree in Computer Science, Engineering, or a related field.
5+ years of experience in a DevOps or Site Reliability Engineering role.
Strong experience with bare metal infrastructure and highly available distributed systems.
Experience handling reliability issues in critical environments, including root cause analysis and in-production troubleshooting.
Experience working against reliability targets such as SLAs, supported by observability and alerting.
Hands-on experience with CI/CD, containerization, and orchestration tools such as Docker and Kubernetes.
Knowledge of monitoring, logging, alerting, and observability tools such as Prometheus, Grafana, ELK Stack, or Datadog.
Familiarity with infrastructure-as-code tools such as Terraform or CloudFormation.
Proficiency in scripting languages such as Python, Go, or Bash, with knowledge of software development best practices.
Strong understanding of networking, security, and system administration concepts.
Experience in an AI/ML environment is preferred.
Experience with high-performance computing systems and workload managers such as Slurm is preferred.
Experience with modern AI-oriented infrastructure solutions such as Fluidstack, CoreWeave, or Vast is preferred.

Benefits

Competitive salary and equity.
Health insurance.
Transportation allowance.
Sport allowance.
Meal vouchers.
Private pension plan.
Generous parental leave policy.
Visa sponsorship.
