Zscaler

Zscaler is a cybersecurity pioneer providing industry-leading CASB and SASE solutions, revolutionizing internet security with a cloud-based platform that protects users worldwide.

Internet Software & Services

Information Technology

1K-5K employees (4,975)

Founded 2007

111 open positions

Staff Site Reliability Engineer

16 hours, 32 minutes ago

United States

Full-time

Lead

Site Reliability Engineer (SRE)

DevOps and Infrastructure

Bash, CI/CD, DNS, Go, HTTP, Kubernetes, Linux, Load Balancing, OpenTelemetry, Prometheus, Python, TCP/IP

Description

Own the reliability of a large-scale cloud service across Linux/BSD, bare metal, Kubernetes, custom load balancing, and SD-WAN.
Partner with Engineering and Network teams early to define requirements, conduct operability reviews, and contribute code and design documentation for resilience.
Develop and operate end-to-end observability, including metrics, logs, traces, dashboards, alerting, and incident tooling.
Manage SLOs and error budgets while reducing alert noise and improving the detection and diagnosis of system issues.
Participate in on-call rotation and lead full-cycle incident response.
Perform deep cross-stack troubleshooting across operating systems, networking, distributed systems, packet captures, and core dumps.
Drive permanent software fixes and convert incident learnings into runbooks and tests.
Build and maintain infrastructure and service lifecycle automation using everything-as-code.
Drive provisioning, configuration, release automation, canary deployments, and rollout/rollback workflows.
Improve platform hygiene through OS and application upgrades, patching, capacity tuning, performance tuning, and CI/CD validation before production rollouts.

Requirements

US citizenship is required due to the nature of assigned customers.
5+ years of industry experience in software engineering, infrastructure software, and/or platform engineering.
Proficiency in at least one programming language such as Python, Bash, or Go.
Demonstrated ability to write production-quality code, including testing, code reviews, CI, and maintainable design.
Strong Linux/Unix systems fundamentals, including processes, memory, filesystems, networking basics, and debugging/performance troubleshooting.
Solid understanding of networking protocols and concepts such as HTTP, DNS, TCP/IP, ICMP, the OSI model, subnetting, and load balancing.
Proven experience operating production services, including incident response, troubleshooting, and reducing toil.
Ability to participate in on-call rotations and support occasional after-hours or weekend deployments.
Experience managing BSD in production and driving systemic fixes through platform engineering.
Preferred: proven expertise in operating Kubernetes at scale.
Preferred: deep experience with Prometheus and OpenTelemetry, including golden signals, SLOs, and alert tuning.

Benefits

Base salary range of $119,000 to $170,000 USD.
Eligibility for commission, bonus, and equity, if applicable.
Various health plans.
Time off plans for vacation and sick time.
Parental leave options.
Retirement options.
Education reimbursement.
In-office perks and hybrid/remote work flexibility.

Similar Roles

Database Reliability Engineer

Sporty Group, 51-250 employees, Media

Sporty is seeking a Database Reliability Engineer to own and improve its database infrastructure supporting multiple platforms and international expansion.

Europe, Latin America, Full-time, Mid Level, Database Administrator, Site Reliability Engineer (SRE)

Ansible, Argo CD, Elasticsearch, GitHub Actions, Go, Grafana, Helm, Jenkins, Kubernetes, MongoDB, MySQL, PostgreSQL, Prometheus, Python, RabbitMQ, Terraform

8 hours, 2 minutes ago

Senior Site Reliability Engineer

Moniepoint, 1K-5K employees, Diversified Financial Services

Moniepoint is hiring an experienced Site Reliability Engineer to improve the reliability, scalability, and observability of its highly distributed financial platform serving emerging markets.

Nigeria, Full-time, Senior, Site Reliability Engineer (SRE)

AWS, Azure, Datadog, GCP, Go, Java, Kafka, Kubernetes, Microservices, MySQL, New Relic, OpenTelemetry, PostgreSQL, Prometheus, Python, RabbitMQ, Rust

8 hours, 47 minutes ago

Senior Site Reliability Engineer, Identity Platform

Coinbase, 1K-5K employees, Capital Markets

Coinbase is hiring an experienced Site Reliability Engineer to build and scale identity and access management tooling for its IT Operations Corporate Engineering team supporting cloud-based, security-first systems.

United States, Full-time, Senior, Site Reliability Engineer (SRE)

$186k-$219k

Ansible, AWS, Azure, C#, CI/CD, Docker, GCP, Go, Java, Kubernetes, Python, Ruby, Secrets Management, Terraform

9 hours, 17 minutes ago

Database Reliability Engineer - Core Team

ClickHouse, 51-250 employees, IT Services

ClickHouse is hiring a Site Reliability Engineering team member for ClickHouse Core to improve the reliability, availability, scalability, and performance of ClickHouse Cloud for customers worldwide.

Australia, Full-time, Senior, Site Reliability Engineer (SRE)

AWS, Azure, C++, ClickHouse, GCP, Python, SQL

9 hours, 47 minutes ago
