Senior SRE - Platform (Managed Kubernetes Infrastructure)

8 hours, 8 minutes ago
Full-time
Senior
Software Development
Elastic

Elastic is a leading platform for search-powered solutions, providing real-time insights and making data usable for developers and enterprises worldwide.

Internet Software & Services
1K-5K
Founded 2010

Description

  • Lead technical initiatives to automate system engineering work and improve the reliability of Elastic's global infrastructure.
  • Design, build, scale, and mature the multi-cloud platform used to host internal and external services.
  • Develop and maintain software, tooling, and automations that expand platform and infrastructure capabilities.
  • Respond to major incidents, prevent repeated customer impact, and drive prioritized problem management.
  • Collaborate with engineers and cross-functional partners to identify, implement, and deliver operational solutions.
  • Support operational excellence and help uplift others within an inclusive, distributed team environment.
  • Participate in the follow-the-sun on-call rotation, mostly during working hours.

Requirements

  • Experience solving operational problems from a Site Reliability Engineering perspective with a customer-first mindset.
  • Background in software engineering and the ability to collaborate with engineers on solutions, ideally using Golang.
  • Production experience with public cloud service providers.
  • Experience managing Kubernetes infrastructure at scale.
  • Ability to communicate inclusively and work effectively in distributed or remote teams.
  • Bonus/preferred experience operating SaaS products in public cloud environments using Infrastructure-as-Code tools such as Crossplane or Terraform.
  • Bonus/preferred experience building or operating Kubernetes-at-scale infrastructure across multiple cloud providers.
  • Bonus/preferred experience working with containerized services using tools such as Docker.
  • Bonus/preferred experience improving alerting, incident management, and metrics systems such as Elastic Stack, Prometheus, or Influx.
  • Bonus/preferred experience in Linux system administration on distributed systems at scale.
  • Bonus/preferred experience diagnosing, designing, implementing, or creating solutions with the Elastic Stack.
  • Bonus/preferred experience coaching and mentoring teammates in a globally distributed environment.

Benefits

  • Base salary range of CAD $148,300 to $185,600.
  • Eligible to participate in Elastic's stock program.
  • Company-matched RRSP with dollar-for-dollar matching up to 6% of eligible earnings.
  • Health coverage for you and your family in many locations.
  • Flexible locations and schedules for many roles.
  • Generous number of vacation days each year.
  • Up to 40 hours per year for volunteer projects.
  • Minimum of 16 weeks of parental leave.
