CloudLinux

CloudLinux is a leading provider of the CloudLinux OS, a platform for Linux web hosting that offers next-level performance and security. With a focus on optimizing web hosting environments, CloudLinux helps service providers improve density, stability,...

IT Services

Information Technology

51-250 (210)

Founded 2009

18 open positions

Senior Database Reliability Engineer (DBRE) (worldwide remote)

2 months ago

Slovakia, Spain, Georgia, Montenegro, Poland, Armenia

Full-time

Senior

Database Administrator

DevOps and Infrastructure

Ansible ClickHouse DNS GitLab Grafana JIRA Linux MongoDB OpsGenie PostgreSQL Redis Terraform TLS

Description

Own production PostgreSQL reliability, including HA design, Patroni, PgBouncer, replication, failover, upgrades, vacuum and bloat control, query tuning, locks, indexes, capacity, backups, PITR, and restore validation.
Improve disaster recovery by maintaining tested restores, documented recovery paths, measurable RTO/RPO targets, runbooks, and safe maintenance plans.
Support and troubleshoot the wider database estate, including ClickHouse, MongoDB, and Redis, while improving monitoring and access/data-safety controls.
Automate DBA workflows using Ansible, Terraform/OpenTofu, GitLab CI/CD, scripts, and reproducible runbooks for provisioning, grants, backups, restores, health checks, and ownership metadata.
Help build DBaaS-style self-service capabilities so engineering teams can request databases, access, credentials, and operational checks with less manual DBA intervention.
Improve observability and incident response through Grafana, metrics, logs, SLOs, alert rules, Opsgenie routing, and clear communication during production issues.
Work closely with engineering teams to reduce repeated DBA tickets and improve reliability, safety, and operational resilience.
Learn and operate the existing production ClickHouse environment safely and effectively.
Maintain clear documentation, evidence, and ownership for database incidents and recovery processes.

Requirements

Deep hands-on PostgreSQL experience in business-critical production environments, typically 5+ years or equivalent depth.
Strong understanding of PostgreSQL internals and operations, including MVCC, WAL, transactions, locks, indexes, query planning, replication, autovacuum, bloat, major upgrades, backups, PITR, and restore testing.
Proven experience with highly available databases and the ability to reason about quorum, split-brain risk, failover, rollback, and recovery.
Strong Linux and infrastructure fundamentals, including systemd, networking, storage, filesystems, CPU/memory/disk bottlenecks, TLS, DNS, firewalls, and root-cause troubleshooting.
Automation skills with Ansible and scripting.
Terraform/OpenTofu, GitLab CI/CD, and merge-request based delivery are strong advantages.
Ability to support more than one database engine and learn ClickHouse quickly even without day-one expertise.
Practical use of AI engineering assistants such as Claude and Codex, with careful personal verification of generated SQL, commands, scripts, and conclusions.
Clear written English for asynchronous work in Jira, Slack, GitLab, Slite, and runbooks.
ClickHouse operations experience, including replication, Keeper/ZooKeeper, MergeTree engines, distributed DDL, grants, row policies, backups, query troubleshooting, and cluster recovery, is preferred.
MongoDB replica sets and Percona Backup for MongoDB experience is preferred.
Redis/Sentinel and broker/cache failure mode experience is preferred.
Database observability, SLOs, golden signals, alert tuning, and executable incident runbooks are preferred.
Experience building internal platforms, self-service portals, or DBaaS workflows is preferred.

Benefits

Fully remote work with flexible working hours and the ability to work from any location worldwide.
24 days of paid vacation per year, plus 10 days of national holidays and unlimited sick leave.
Compensation for private medical insurance.
Co-working and gym/sports reimbursement.
Budget for education and professional development.
Opportunity to receive a reward for the most innovative idea that the company can patent.
Interesting and challenging projects with real production impact.
