Tinybird

Tinybird is a real-time analytics platform that enables data teams and developers to build low-latency APIs in minutes using SQL. It ingests millions of rows per second and serves high-concurrency analytical queries, helping businesses turn data into r...

IT Services

Information Technology

11-50 (30)

Founded 2019

$40M raised

2 open positions

Site Reliability Engineer

2 weeks, 6 days ago

Spain, Europe

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

Ansible AWS C++ ClickHouse GCP Grafana Kubernetes Linux Load Balancing Python Redis REST API SQL Terraform

Description

Design, build, and evolve a self-service platform that automates capacity decisions and autoscaling for a large-scale distributed system.
Operate and optimize infrastructure and software to ensure high availability, reliability, and elasticity as the customer base grows.
Work closely with product and backend teams to design system architecture, optimize resource usage, and make the platform more autonomous.
Participate in on-call rotations to investigate incidents, understand client-facing issues, and improve the on-call experience and disaster recovery tooling.
Tune and extract maximum performance from ClickHouse and other core data services, including understanding their internals when needed.
Design, operate, and tune production-grade Kubernetes clusters, including writing custom controllers/operators and configuring autoscaling mechanisms (e.g., KEDA, Karpenter).
Improve observability from low-level resource usage to high-level service metrics and implement better monitoring and alerting.
Automate provisioning and deployments, and collaborate on infrastructure-as-code and config management workflows (Terraform, Ansible).
Write maintainable software, document key insights and solutions, and communicate effectively with the team both asynchronously and in direct interactions.

Requirements

Proven experience designing, building, and running distributed cloud architectures and large-scale web-based applications.
Strong programming skills with willingness to dive into codebases and native components (experience with Python and C++ preferred).
Deep expertise in Kubernetes, including designing/operating clusters, custom controllers/operators, and autoscaling tuning (KEDA, Karpenter).
Experience with ClickHouse or with deploying database systems at scale is a strong plus, as is the ability to reason about database internals for performance tuning.
Proficiency working with AWS and GCP cloud providers.
Familiarity with Linux-based production environments.
Experience or familiarity with OpenResty, Varnish, Redis, Terraform, and Ansible is beneficial.
Ability to think in systems terms, anticipate edge cases and failure modes, and take ownership of reliability and incident resolution.
Willingness to participate in on-call rotations and to be based in an EU time zone.

Benefits

Salary range €62,000 - €109,000 per year plus a stock options grant.
22 days of holiday per year, plus your birthday and public holidays.
Remote-first work with the freedom to work from wherever suits you best.
Up to €2,400 to help set up your home workspace.
Access to company offices in Madrid and New York City when needed.
Early-stage company impact and transparency into company decisions and roadmaps.

Similar Roles

SRE / Platform Reliability Architect

NEORIS 5K-10K Internet Software & Services

EPAM NEORIS is seeking an SRE/Platform Reliability Architect to lead platform reliability and resiliency design, incident response, and cross-functional alignment for digital transformation initiatives.

Colombia Full-time Lead Site Reliability Engineer (SRE)

CI/CD Grafana Kubernetes OpenTelemetry Prometheus Terraform

56 minutes ago

Contract: Senior Site Reliability Engineer

Newsela 251-1K Diversified Consumer Services

Newsela is hiring a Senior Site Reliability Contractor to improve and automate infrastructure, monitoring, and release operations for its cloud-based education platform.

Chile Colombia Costa Rica Argentina Brazil Contract Senior Site Reliability Engineer (SRE)

Agile AWS CI/CD Datadog Docker GCP GitHub Actions JIRA MySQL Neo4j PostgreSQL Prefect Python Redis SQL Terraform

1 hour, 26 minutes ago

Principal Site Reliability Engineer

Zscaler 1K-5K Internet Software & Services

Zscaler is hiring a Principal Site Reliability Engineer to join its Infrastructure Services and Architecture team, owning cloud and infrastructure reliability for customer-facing systems in a hybrid or remote role.

United States Full-time Lead Site Reliability Engineer (SRE)

$192k-$275k

Agile Ansible CI/CD Git Go HashiCorp Vault Kubernetes Linux OpenID Connect Python Terraform

1 hour, 56 minutes ago

Senior Site Reliability Engineer

OfficeSpace Software 251-1K Internet Software & Services

OfficeSpace Software is hiring a Senior Site Reliability Engineer to own the performance, reliability, and cost efficiency of its production platform at scale while helping modernize operations with AI-assisted reliability engineering.

United States Full-time Senior Site Reliability Engineer (SRE)

Ansible Apache Argo CD CI/CD Datadog GitOps Grafana Kubernetes Linux MariaDB Microservices MySQL Nginx PostgreSQL Prometheus Puppet Python Redis Ruby Ruby on Rails Sidekiq Terraform

3 hours, 41 minutes ago
