Staff Site Reliability & DevOps Engineer - Observability

2 hours, 19 minutes ago
Full-time
Lead
DevOps and Infrastructure
Cision

Cision is a global provider of PR software and marketing solutions, empowering communication pros to engage audiences effectively.

Professional Services
5K-10K
Founded 1892
$83M raised

Description

  • Design, build, and operate observability platforms based on Grafana and Prometheus.
  • Define and maintain metrics standards, dashboards, alerts, and service level objectives (SLOs).
  • Improve signal quality by reducing alert noise, tuning thresholds, and enhancing runbooks.
  • Support incident response with actionable telemetry and post-incident analysis.
  • Integrate metrics, logs, and traces across distributed systems.
  • Work with engineering teams to instrument services correctly.
  • Automate observability configuration using infrastructure as code.
  • Contribute to reliability improvements through capacity planning and performance analysis.

Requirements

  • Strong experience with Prometheus, including scraping, federation, recording rules, and alerting.
  • Strong experience with Grafana, including dashboards, alerting, templating, and RBAC.
  • Solid Linux and networking fundamentals.
  • Experience running observability stacks in Kubernetes environments.
  • Infrastructure as code experience, with Terraform preferred.
  • Familiarity with incident management and on-call practices.
  • Ability to debug production systems using metrics and logs.
  • Experience with logging and tracing tools such as Loki, Tempo, or OpenTelemetry is preferred.
  • Experience operating large-scale or multi-cluster Kubernetes platforms is preferred.
  • Experience with cloud platforms such as GCP, AWS, or OCI is preferred.
  • Exposure to SRE concepts such as error budgets and SLO-driven prioritisation is preferred.

Benefits

  • Cision is a global company with offices in 24 countries.
  • The company promotes an inclusive environment where employees can be authentic and supported.
  • Cision is an equal opportunity employer.
  • Reasonable accommodations are available for candidates and employees with disabilities.
  • Access to a global candidate privacy commitment and hiring process protections.
