Systems Engineer, HPC (US & Canada)

Posted: 10 hours, 53 minutes ago
Employment type: Full-time
Seniority: Senior
Category: DevOps and Infrastructure

Mistral AI

Mistral AI is a French AI company that builds frontier AI models, assistants, agents, and services for consumers and enterprises. Its mission is to make frontier AI accessible to everyone and to democratize AI through open-source, efficient, and innovative models, products, and solutions.

Industry: Artificial Intelligence
Company size: 201-500
Founded: 2023

Description

  • Operate and maintain large-scale Linux environments across bare metal, clusters, and cloud infrastructure.
  • Monitor system health, troubleshoot incidents, and support high availability.
  • Support production and research workloads across multiple environments.
  • Help scale clusters from hundreds to thousands of nodes.
  • Work on systems with petabyte-scale storage, improving their performance, reliability, and resource utilisation.
  • Automate operational tasks using tools such as Python, Bash, Ansible, or Terraform.
  • Improve deployment, provisioning, and system lifecycle management.
  • Contribute to system design and architecture decisions.
  • Collaborate closely with HPC, infrastructure, platform, DevOps, and research teams.
  • Act as a bridge between users and infrastructure.

Requirements

  • Strong Linux systems administration experience is required.
  • Experience working in large-scale environments such as HPC clusters or cloud infrastructure.
  • Experience with job schedulers such as Slurm.
  • Solid troubleshooting skills across systems, hardware, and networks.
  • Experience with containers or orchestration such as Kubernetes is preferred.
  • Experience with storage systems such as Ceph, Lustre, or NFS is preferred.
  • Networking fundamentals, including Ethernet, are required; InfiniBand experience is a plus.
  • Experience with infrastructure as code or automation tooling is preferred.
  • GPU or AI/ML experience is a plus.
  • Comfort working in a fast-scaling, collaborative, hands-on environment.

Benefits

  • Competitive compensation.
  • Benefits package.
  • Remote work flexibility.
  • Opportunity to contribute to cutting-edge, high-impact AI infrastructure.
  • Chance to help shape data centre operations in a high-growth startup environment.
  • Work with a talented, cross-functional team.

Interested in this position?

Apply directly on the company website

