Alphasense

Alphasense is a global leader in providing high-quality gas sensors and air quality monitors to industrial OEMs. With over 25 years of experience, the company offers a wide range of innovative gas sensor technologies for various applications, including...

Industrial Conglomerates
51-250 employees
Founded 1996

Description

  • Architect reliability frameworks and self-service tooling that enable teams to own the reliability of their services.
  • Lead the company’s AIOps strategy by automating diagnostics, remediation, and proactive failure prevention.
  • Embed SRE practices across engineering through design reviews, production readiness reviews, and operational standards.
  • Serve as Incident Commander during critical incidents and drive blameless postmortems that lead to lasting improvements.
  • Deliver end-to-end observability across monitoring, tracing, and profiling to proactively improve performance.
  • Mentor and support engineers across SRE and product teams through technical guidance and knowledge sharing.
  • Influence architectural decisions and set technical standards for reliability across the company.

Requirements

  • 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role.
  • At least 3 years operating in a Senior+ SRE position.
  • Strong experience running production SaaS systems at scale.
  • Proficiency in at least one programming or scripting language such as Python or Go.
  • Hands-on experience with Kubernetes and with cloud platforms such as AWS, GCP, or Azure.
  • Deep understanding of networking fundamentals including TCP/IP, DNS, HTTP/S, and load balancing.
  • Experience with monitoring and alerting tools such as Prometheus, Grafana, Datadog, or ELK.
  • Familiarity with advanced observability tools such as OpenTelemetry (OTel) and continuous profiling.
  • Proven incident management experience, including leading high-severity incidents and postmortems.
  • Strong troubleshooting, communication, and collaboration skills.
