Sumo Logic

Sumo Logic offers top-tier cloud monitoring, log management, and Cloud SIEM tools for web and SaaS apps, empowering businesses with real-time insights and high-quality software delivery.

Internet Software & Services
251-1K
Founded 2010

Description

  • Improve the lifecycle of microservices and related architectural components from design through deployment, operation, and refinement.
  • Define, evolve, and manage service level objectives (SLOs).
  • Write code and automation to reduce operational workload, improve efficiency, strengthen security posture, and eliminate toil.
  • Scale systems sustainably through automation and reliability-focused improvements.
  • Facilitate blame-free root cause analysis meetings and drive learning from incidents.
  • Participate in and improve global incident response coordination across products.
  • Drive root cause identification and issue resolution with cross-functional teams.
  • Work closely with multiple teams to optimize the operations of their microservices.
  • Operate in a fast-paced, iterative environment.

Requirements

  • 6+ years of industry experience.
  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or another scientific or technical discipline.
  • Cloud-native application development experience using best practices and design patterns.
  • Strong debugging and troubleshooting skills across the full technology stack.
  • Deep understanding of AWS networking, compute, storage, and managed services.
  • Experience with modern CI/CD and deployment tooling such as Kubernetes, Terraform, Ansible, and Jenkins.
  • Experience with full lifecycle support of services, from creation to production support.
  • Infrastructure as Code experience with tools such as Terraform or AWS CloudFormation.
  • Ability to author production-ready code in at least one of Java, Scala, or Go.
  • Experience with Linux systems and command-line work.
  • Understanding of modern cloud-native software security practices.
  • Experience with agile frameworks such as Scrum and Kanban.
  • Flexibility to step into new roles and responsibilities.
  • Willingness to learn and use Sumo Logic products to solve reliability and security issues.
  • Preferred: experience using Sumo Logic or other observability products for reliability and security.
  • Preferred: experience with planet-scale product development.
  • Preferred: expert-level experience running and operating SaaS products on AWS.
  • Preferred: experience with streaming technologies such as Kafka, Kafka Streams, or KSQL.
  • Preferred: expert-level experience in one or more of Java, Go, Scala, or Python.
  • Preferred: expert-level experience in one or more of Terraform, Jenkins, or Kubernetes.
  • Preferred: extensive experience running and tuning JVM workloads at scale.
