Engineering Manager, Infrastructure Platforms

3 hours, 59 minutes ago
Full-time
Mid Level
Software Development
GitLab

GitLab: The comprehensive DevOps platform revolutionizing software development with automation, AI workflows, and essential tools for efficient collaboration.

Internet Software & Services
1K-5K
Founded 2014

Description

  • Hire and manage a high-performing Site Reliability Engineering team in India, including coaching, regular 1:1s, and career development.
  • Design, coordinate, and continuously refine the team’s shift and weekend coverage model for Dedicated cutovers across EMEA and US hours.
  • Own operational execution of Dedicated Geo migrations and cutovers, including planning, pre-cutover preparation, live execution, and post-cutover validation and cleanup.
  • Ensure timely, high-quality responses to Geo-related escalations from Support and internal partners.
  • Make and drive technical decisions for the team, stepping in as final decision-maker during high-stakes migrations or incidents.
  • Build and maintain runbooks, guardrails, and post-cutover reviews to enable rigorous, repeatable operations.
  • Collaborate with core Geo, Dedicated migrations, and other Infrastructure teams to identify and prioritize engineering investments that improve migration tooling and processes.
  • Define, track, and report key operational metrics (e.g., escalation volume absorbed, internal escalation rate, cutover coverage, response times, team health) and use them to drive continuous improvement.
  • Participate in the Incident Management on-call rotation to help meet availability and reliability goals for GitLab.com.

Requirements

  • 3+ years managing SRE, infrastructure, or platform engineering teams operating highly available distributed systems at scale, ideally in a SaaS environment with customer-facing SLAs.
  • Proven ability to lead in a remote, high-performance environment, collaborating across multiple time zones and cultures.
  • Experience running or significantly contributing to large-scale data migrations where customer data integrity and downtime risk are carefully managed.
  • Strong infrastructure background including cloud platforms, observability, incident response, and distributed multi-tenant architectures.
  • Excellent communication and interpersonal skills with the ability to translate technical concepts and risk trade-offs for technical and non-technical stakeholders, including customers.
  • Strong problem-solving skills, attention to detail, and a focus on delivering high-quality, low-risk operational outcomes.
  • Willingness and availability to coordinate shift and weekend coverage and participate in on-call Incident Management rotations.
  • Experience with managed/hosted or regulated/compliance-sensitive environments (e.g., SOC 2, ISO) (preferred).
  • Working knowledge of SRE and migration technologies such as Kubernetes, Terraform, observability stacks, and scripting languages (preferred).
  • Experience using GitLab, contributing to open source, or working in enterprise developer tools or high-growth infrastructure product companies (preferred).

Benefits

  • Benefits to support your health, finances, and well-being
  • Flexible Paid Time Off
  • Team Member Resource Groups for inclusion and belonging
  • Equity compensation and Employee Stock Purchase Plan
  • Growth and Development Fund for learning and career development
  • Parental leave
  • Home office support
  • Remote-first hiring and global remote work eligibility
