Grafana

Grafana is an open observability platform that provides analytics, monitoring, and visualization, with a focus on user control and cost efficiency.

Industry: IT Services
Company size: 1K-5K
Founded: 2014
Funding: $535M raised

Description

  • Design, build, and operate reconciliation systems that track desired stack state and detect and repair configuration drift.
  • Collaborate across SSS, grafana.com, deployment configurations, and adjacent teams to keep stack lifecycle workflows reliable and resilient.
  • Improve operational efficiency by simplifying deployment and rollout processes for stack services.
  • Manage rollout mechanisms for plugins, dashboards, data sources, Grafana versions, release channels, and stack-level configuration.
  • Support new region and cluster rollouts and the operational paths required to bring stacks online safely.
  • Improve incident response and recovery for stack misalignment, reconciliation failures, rollout issues, and integration failures.
  • Partner with Product, Hosted Grafana, Infrastructure, Support, and other AppCore squads on customer-impacting lifecycle work.
  • Contribute to roadmap planning, technical design, on-call improvements, and long-term simplification of stack operations.
  • Own the production behavior of the systems you build by improving runbooks, dashboards, alerts, safety controls, and recovery procedures.
  • Write efficient, readable, maintainable code and implement new microservices or systems as needed.

Requirements

  • At least 1 year of fully remote work experience.
  • Professional experience with Golang.
  • Experience working on a SaaS platform.
  • Familiarity with distributed systems concepts such as scalability, multi-tenancy, and high availability.
  • Ability to work across both backend services and application code.
  • Strong focus on developer experience, user experience, and product quality.
  • Experience contributing to projects from initial brainstorming through delivery.
  • Ability to write clean, well-tested software that is easy for other engineers to operate and maintain.
  • Experience breaking down well-defined tasks into iterative deliveries and gathering feedback.
  • Willingness to collaborate across teams and align work with other squads and external stakeholders.
  • Familiarity with Kubernetes in AWS, GCP, or Azure.
  • Exposure to infrastructure-as-code tools such as Helm, Terraform, or Jsonnet.
  • Experience participating in blameless incident response and post-incident reviews.
  • Experience with TypeScript/Node.js is a plus.
  • Experience with Kubernetes control-plane patterns, operators, reconcilers, or desired-state systems is a plus.
  • Experience with Jsonnet/Tanka, Terraform, Flux, Argo, or similar deployment/configuration tooling is a plus.
  • Experience with SaaS provisioning, tenancy, regional expansion, plugin rollout, or customer lifecycle systems is a plus.
  • Experience with incident response involving configuration drift, partial failure, or cross-service state mismatch is a plus.

Benefits

  • UK compensation range of GBP 72K - GBP 90K.
  • Restricted Stock Units (RSUs).
  • 100% remote, global work culture.
  • 30 days of annual leave per year, including 3 Grafana Shutdown Days.
  • In-person onboarding.
  • Career growth pathways and development opportunities.
  • Transparent communication and approachable leadership.
  • High trust, low ego, innovation-driven environment.
