Senior Software Engineer - Grafana Databases, Managed Services | UK | Remote

3 hours, 2 minutes ago
Full-time
Senior
DevOps and Infrastructure
Grafana

Grafana is the open observability platform providing analytics, monitoring, and visualization solutions with a focus on user control and cost efficiency.

IT Services
1K-5K employees
Founded 2014
$535M raised

Description

  • Operate and evolve 100+ multi-cloud streaming clusters and related database infrastructure in production.
  • Diagnose and eliminate cross-layer failure modes affecting storage, control planes, query performance, and scaling.
  • Design safe upgrade and rollout strategies across large-scale production environments.
  • Improve observability, automation, and operational ergonomics for core systems.
  • Partner with database and platform teams on scaling, partitioning, consumer fan-out, and performance.
  • Work with distributed systems behavior, Kubernetes scheduling, storage engines, and compression trade-offs.
  • Serve as a primary escalation point and participate in on-call coverage for incidents.
  • Own relationships with system vendors, including WarpStream Labs and other providers.
  • Review pull requests and contribute to design documents, tooling, automation, and code improvements.
  • Participate in incident investigation, resolution, and post-incident reviews.

Requirements

  • 6+ years of engineering experience, including time in SRE, platform engineering, production engineering, infrastructure engineering, or distributed systems roles.
  • Experience operating distributed systems in production, such as streaming systems, analytical databases, or large-scale storage backends.
  • Strong Kubernetes experience in AWS, GCP, or Azure.
  • Familiarity with infrastructure-as-code tools such as Helm, Terraform, or Jsonnet.
  • Solid understanding of distributed systems design and large-scale trade-offs.
  • Proficiency in at least one programming language; Go is preferred.
  • Working knowledge of Linux internals, networking, cloud storage, and performance/scaling behavior.
  • Experience with blameless incident response and writing high-quality post-incident reviews.
  • Ability to communicate clearly, collaborate across teams, and work autonomously.
  • Preferred: experience with systems such as Kafka, Redpanda, WarpStream, Postgres, ClickHouse, Snowflake, or Cassandra.

Benefits

  • Base salary range of GBP 91,755 to GBP 110,106 in the UK.
  • Equity included as part of the compensation package, as Restricted Stock Units (RSUs) for all roles.
  • Bonus eligibility, if applicable.
  • 100% remote, global working environment.
  • Global annual leave policy of 30 days per year, including 3 Grafana Shutdown Days.
  • Company-funded access to modern AI coding assistants and frontier models within security guidelines.
  • In-person onboarding for new hires.
