Site Reliability Engineer, Infrastructure Shared Services

2 hours, 50 minutes ago
Full-time
Senior
DevOps and Infrastructure
Pure Storage

Pure Storage is a top all-flash enterprise storage company that simplifies data storage with modern, easy-to-manage solutions, driving business and IT transformation for organizations worldwide.

IT Services
1K-5K employees
Founded 2009

Description

  • Own and support Everpure’s infrastructure, internal tooling, and production services.
  • Design, operate, maintain, and troubleshoot enterprise systems including databases, message queues, APIs, and distributed applications.
  • Use data, metrics, SLOs, and error budgets to improve reliability and operational performance.
  • Establish and practice sustainable incident response and blameless postmortems to prevent recurrence.
  • Support services before launch through system design, software platform and framework development, capacity planning, and launch reviews.
  • Scale systems sustainably through scripting and automation, improving the reliability and velocity of operational management.
  • Collaborate with internal engineering teams, business units, and distributed teams across multiple time zones to deliver customer outcomes.
  • Help define and improve observability, application data management, and incident management using modern technologies, including AI, to reduce engineering toil.

Requirements

  • 5+ years of experience as a Site Reliability Engineer, DevOps Engineer, or Infrastructure Engineer.
  • Demonstrated coding ability in a functional or object-oriented language, with Python or Go preferred.
  • Experience designing, implementing, delivering, and maintaining large-scale distributed software systems.
  • Strong understanding of Unix/Linux.
  • Experience with infrastructure-as-code and automation tools such as ArgoCD, Ansible, Terraform, or CloudFormation.
  • Experience with containers and container orchestration systems, particularly Kubernetes.
  • Experience working in hybrid environments, including bare metal and public cloud, with AWS preferred.
  • Experience analyzing performance and troubleshooting distributed systems.
  • Ability to work in a 24x7 on-call rotation using a follow-the-sun model, approximately 1 week every 2–3 months.
  • Strong communication skills, a systematic problem-solving approach, ownership, drive, and the ability to prioritize independently and follow through to completion.

Benefits

  • Flexible time off.
  • Wellness resources.
  • Company-sponsored team events.
  • Opportunities for growth and development.
  • Supportive team culture with a focus on collaboration and low ego.
  • Recognition as a Great Place to Work and other workplace awards.
  • Accommodations available for candidates with disabilities during the hiring process.
