MongoDB

MongoDB provides a developer data platform that simplifies data management and accelerates application development, enabling businesses to leverage modern database technology for innovative solutions across various industries.

Internet Software & Services
1K-5K employees
Founded 2007

Description

  • Contribute to developing and maintaining a scalable, secure Kubernetes-based runtime environment that supports product needs across MongoDB.
  • Provide internal support for the Kubernetes ecosystem and partner with engineering teams to solve domain-specific problems.
  • Participate in a 24/7 on-call rotation to resolve critical issues.
  • Perform blameless post-mortems and drive systemic fixes to prevent repeat incidents.
  • Help manage the end-to-end lifecycle of the Kubernetes fleet and related reliability and security components.
  • Support the migration from Terraform-based infrastructure as code to an Operator-driven lifecycle management model.

Requirements

  • 6+ years of experience in software development and operating distributed systems.
  • Proficiency in Go, Python, or a similar language.
  • Strong code quality and testing practices, including unit, integration, and end-to-end tests.
  • Deep experience using and extending containerization technologies, preferably Kubernetes.
  • Solid understanding of Linux operating system internals and networking concepts such as filesystems, TCP/IP, DNS, and TLS.
  • Customer-focused mindset with internal developers treated as primary users.
  • Strong operational ownership and experience debugging complex production issues to resolution.
  • Preference for automation over manual processes.
  • Experience designing and implementing secure, multi-tenant runtime environments from first principles (preferred).
  • Experience with Kubernetes ecosystem tools such as Helm, Kustomize, Gatekeeper, Kyverno, CRDs/Operators, CRI, and CSI (preferred).
  • Experience with cloud infrastructure platforms such as AWS, GCP, or Azure (preferred).
  • Experience provisioning infrastructure with Terraform, Crossplane, or AWS Controllers for Kubernetes (ACK) (preferred).
  • Advanced knowledge of Linux system internals and container networking concepts, such as namespaces and cgroups (preferred).

Benefits

  • Base salary range of $127,000 to $249,000 USD for U.S.-based candidates.
  • Equity as part of total compensation for eligible employees.
  • Employee stock purchase program.
  • Flexible paid time off.
  • 20 weeks of fully paid gender-neutral parental leave.
  • Fertility and adoption assistance.
  • 401(k) plan.
  • Mental health counseling and access to transgender-inclusive health insurance coverage.
