Site Reliability Engineer (Senior or Staff), Atlas

1 hour, 10 minutes ago
Full-time
Senior
DevOps and Infrastructure
MongoDB

MongoDB provides a developer data platform that simplifies data management and speeds up application development for businesses across industries.

Internet Software & Services
1K-5K employees
Founded 2007

Description

  • Design and build complex systems for the Atlas platform.
  • Support, maintain, and grow the Atlas platform with an ownership mindset.
  • Work closely with Atlas software engineering teams to run systems at scale.
  • Build new tooling and automation to improve reliability and operational efficiency.
  • Perform essential maintenance across the Atlas fleet.
  • Develop and contribute to a reliable, resilient multi-cloud platform for business-critical applications.
  • Collaborate with service-owning teams to solve technical challenges and adapt tooling for novel use cases.
  • Participate in a 24/7 on-call rotation to respond quickly to Atlas incidents and minimize customer impact.

Requirements

  • 5+ years of experience running critical systems at scale.
  • Experience with at least one major cloud provider such as AWS, Azure, or GCP.
  • Ability to build and operate systems in a multi-cloud environment.
  • Strong understanding of running large-scale Linux environments, including low-level fundamentals.
  • Firm grasp of at least one modern programming language such as Go, Ruby, or Python.
  • Solid understanding of web and network protocols and standards such as HTTP, TLS, and DNS.
  • US citizenship is required.
  • Experience with infrastructure or SRE work in customer-facing environments.
  • Preference for automation and process efficiency over manual operational work.
  • Experience operating with autonomy and ownership in complex systems environments.

Benefits

  • Base salary range of $127,000 to $249,000 USD for U.S.-based candidates.
  • Equity as part of total compensation for eligible employees.
  • Employee stock purchase program.
  • Flexible paid time off.
  • 20 weeks of fully paid gender-neutral parental leave.
  • Fertility and adoption assistance.
  • 401(k) plan.
  • Mental health counseling and transgender-inclusive health insurance coverage.
