Headout

Headout is an on-demand mobile marketplace that offers travelers access to the best tours, attractions, events, and local experiences at discounted prices. With a focus on curated experiences, Headout provides a one-stop solution for discovering and bo...

Consumer Services

Consumer Discretionary

251-1K (510)

Founded 2015

$66M raised

11 open positions

Senior Site Reliability Engineer

1 month, 2 weeks ago

India

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

AWS, Azure, CI/CD, Datadog, GCP, GitHub, GitHub Actions, GitLab, Go, Grafana, Jaeger, Java, Jenkins, Kotlin, Kubernetes, Microservices, MongoDB, MySQL, New Relic, Prometheus, Pulumi, Python, Shell Scripting, Terraform

Description

Manage and optimize Kubernetes clusters and their workloads across cloud infrastructure.
Build and maintain CI/CD pipelines and reusable workflows, including canary release processes.
Design service-level dashboards, fine-tune alerts, and manage incidents across the organization.
Improve application performance by rolling out backend changes that speed up APIs and page loads, make database access more efficient, and resolve bottlenecks.
Architect and build scalable platform tools for cross-pod use cases.
Develop tools and workflows that improve developer velocity and engineering efficiency.
Build guardrails for security practices and help standardize them across the organization.
Collaborate with and mentor junior engineers, drive root-cause analyses, and promote best practices.
Work across DevOps, observability, application performance, and related platform areas.

Requirements

4-7 years of experience operating customer-facing services at scale.
Proficiency in operating, debugging, and optimizing Kubernetes clusters and workloads.
Experience with service mesh and tracing tools such as Istio and Jaeger.
Comfort working with any cloud provider, preferably AWS.
Hands-on experience with monitoring and alerting stacks such as Prometheus, Grafana, Thanos, New Relic, or Datadog.
Experience designing robust CI/CD workflows in tools such as GitHub, GitLab, or Jenkins.
Proficiency with infrastructure as code using Terraform or Pulumi.
Fluency in Python, Go, or Java/Kotlin, plus shell scripting.
Experience working with databases such as MySQL or MongoDB.
Ability to profile applications, database queries, and traces.
Understanding of security best practices and compliance requirements.
High agency and a proactive approach to identifying and fixing issues.
Interest in travel, local experiences, and hospitality is a bonus.
Experience in a rapidly growing startup is a bonus.
Anything out of the box that can surprise the team is a bonus.

Benefits

Work at a profitable, fast-growing company with $130M in revenue and guests in 100+ cities.
Opportunity to influence architecture decisions and the evolution of the stack.
High-impact work that improves deployment turnaround time and p99 performance metrics.
Flexibility to work across different stacks, tools, and platforms.

Similar Roles

Staff Operations Engineer

Mozilla 251-1K Internet Software & Services

Mozilla is hiring a Staff Operations Engineer to lead the design, reliability, and evolution of hybrid-cloud and workplace infrastructure across teams.

Canada Full-time Lead Infrastructure Engineer Site Reliability Engineer (SRE)

$86k-$127k

Ansible, DNS, Linux, Puppet, Python, TCP/IP, Unix

4 hours, 38 minutes ago

Principal Site Reliability Engineer (SRE)

Symmetrio Professional Services

Symmetrio is recruiting a Principal Site Reliability Engineer for a rapidly growing healthcare technology company to own the reliability, scalability, security, and performance of a mission-critical SaaS platform used by healthcare providers across the United States.

United States Full-time Lead Site Reliability Engineer (SRE)

Active Directory, AWS, CI/CD, Datadog, Django, Grafana, Kubernetes, Python, Terraform, Windows Server

4 hours, 53 minutes ago

Performance Test Engineer Lead

PartnerOne 51-250 Media

An enterprise performance engineering role at a cloud-focused organization, responsible for validating the scalability, stability, and production readiness of distributed systems across Azure and hybrid environments.

Egypt Full-time Lead QA Engineer Site Reliability Engineer (SRE)

Azure, CI/CD, Kubernetes, PowerShell

5 hours, 8 minutes ago

Site Reliability Engineer

MLabs 11-50 Internet Software & Services

Remote UK-hours Site Reliability Engineering role at a financial technology company, focused on automating and operating the infrastructure that supports global integration services for financial institutions.

Netherlands Germany Poland United Kingdom Full-time Mid Level Site Reliability Engineer (SRE)

$121k-$148k

Active Directory, Ansible, AWS, CI/CD, GCP, OAuth, PostgreSQL, SAML

5 hours, 23 minutes ago
