ClickHouse

ClickHouse develops a fast, open-source, column-oriented database management system that lets users generate real-time analytical reports with SQL queries, serving industries that need efficient data processing and analysis.

Industry: IT Services
Company size: 51-250
Founded: 2021
Funding: $300M raised

Description

  • Lead reliability and operations for ClickHouse’s Postgres integration, including upgrades, patching, maintenance, and scaling.
  • Design and implement automation for provisioning, deployments, and service lifecycle management across AWS, GCP, and Azure.
  • Develop and maintain infrastructure-as-code using Terraform and modern CI/CD tooling to ensure consistent, repeatable deployments.
  • Build and contribute Go-based tooling and services to improve automation, observability, and developer experience.
  • Own observability and monitoring, including metrics, alerting, and tracing across environments.
  • Drive incident management, lead postmortems, and implement continuous improvement practices to strengthen reliability.
  • Collaborate cross-functionally with platform, networking, and product teams to improve service operability.
  • Mentor and enable engineers, improving team practices and scaling operational capability as customer adoption grows.

Requirements

  • 7+ years of experience in SRE, DevOps, or infrastructure engineering with a track record of running distributed, production-grade systems.
  • Solid understanding of Postgres operations, scaling, and performance tuning.
  • Deep hands-on experience with AWS, exposure to GCP and Azure, and comfort working with multi-cloud topologies.
  • Proficient with Terraform and infrastructure-as-code practices.
  • Experience with Kubernetes and container-based infrastructure.
  • Strong Go development skills or willingness to write and own production Go code.
  • Familiarity with observability tooling such as Prometheus, Grafana, Loki, and OpenTelemetry (or equivalents).
  • Deep understanding of SLOs, incident response, and reliability engineering practices.
  • Experience with CI/CD tooling and automation for repeatable deployments.
  • Founder’s mentality: hands-on, resourceful, autonomous, and comfortable shipping impactful systems.

Benefits

  • Flexible, remote-friendly work environment (ClickHouse operates in ~20 countries).
  • Employer contributions toward healthcare.
  • Equity through stock options for new team members.
  • Flexible time off in the US and generous time-off entitlement in other countries.
  • $500 home office setup allowance for remote employees.
  • Opportunities for company-wide in-person offsites and global gatherings.
  • A typical starting salary range is provided for US-based roles; the company notes that market premiums may apply in certain locations and that ranges may vary.
