Tinybird

Tinybird is a real-time analytics platform that enables data teams and developers to build low-latency APIs in minutes using SQL. It ingests millions of rows per second and serves high-concurrency analytical queries, helping businesses turn data into r...

IT Services
11-50 employees
Founded 2019
$40M raised

Description

  • Design, build, and evolve a self-service platform that automates capacity decisions and autoscaling for a large-scale distributed system.
  • Operate and optimize infrastructure and software to ensure high availability, reliability, and elasticity as the customer base grows.
  • Work closely with product and backend teams to design system architecture, optimize resource usage, and make the platform more autonomous.
  • Participate in on-call rotations to investigate incidents, understand client-facing issues, and improve the on-call experience and disaster recovery tooling.
  • Tune and extract maximum performance from ClickHouse and other core data services, including understanding their internals when needed.
  • Design, operate, and tune production-grade Kubernetes clusters, including writing custom controllers/operators and configuring autoscaling mechanisms (e.g., KEDA, Karpenter).
  • Improve observability from low-level resource usage to high-level service metrics and implement better monitoring and alerting.
  • Automate provisioning and deployments, and collaborate on infrastructure-as-code and config management workflows (Terraform, Ansible).
  • Write maintainable software, document key insights and solutions, and communicate effectively with the team both asynchronously and in direct interactions.

Requirements

  • Proven experience designing, building, and running distributed cloud architectures and large-scale web-based applications.
  • Strong programming skills with willingness to dive into codebases and native components (experience with Python and C++ preferred).
  • Deep expertise in Kubernetes, including designing/operating clusters, custom controllers/operators, and autoscaling tuning (KEDA, Karpenter).
  • Experience with ClickHouse or with deploying database systems at scale is a strong plus, as is the ability to reason about database internals for performance tuning.
  • Proficiency working with AWS and GCP cloud providers.
  • Familiarity with Linux-based production environments.
  • Experience or familiarity with OpenResty, Varnish, Redis, Terraform, and Ansible is beneficial.
  • Ability to think in systems terms, anticipate edge cases and failure modes, and take ownership of reliability and incident resolution.
  • Willingness to participate in on-call rotations and to be based in an EU time zone.

Benefits

  • Salary range €62,000 - €109,000 per year plus a stock options grant.
  • 22 days of holiday per year, plus your birthday and public holidays.
  • Remote-first work with the freedom to work from wherever suits you best.
  • Up to €2,400 to help set up your home workspace.
  • Access to company offices in Madrid and New York City when needed.
  • Early-stage company impact and transparency into company decisions and roadmaps.
