Tinybird

Tinybird is a real-time analytics platform that enables data teams and developers to build low-latency APIs in minutes using SQL. It ingests millions of rows per second and serves high-concurrency analytical queries, helping businesses turn data into r...

IT Services
11-50
Founded 2019
$40M raised

Description

  • Design, build, and evolve a self-service platform that automates capacity decisions and autoscaling for a large-scale distributed system.
  • Operate and optimize infrastructure and software to ensure high availability, reliability, and elasticity as the customer base grows.
  • Work closely with product and backend teams to design system architecture, optimize resource usage, and make the platform more autonomous.
  • Participate in on-call rotations to investigate incidents, understand client-facing issues, and improve the on-call experience and disaster recovery tooling.
  • Tune and extract maximum performance from ClickHouse and other core data services, including understanding their internals when needed.
  • Design, operate, and tune production-grade Kubernetes clusters, including writing custom controllers/operators and configuring autoscaling mechanisms (e.g., KEDA, Karpenter).
  • Improve observability from low-level resource usage to high-level service metrics and implement better monitoring and alerting.
  • Automate provisioning and deployments, and collaborate on infrastructure-as-code and config management workflows (Terraform, Ansible).
  • Write maintainable software, document key insights and solutions, and communicate effectively with the team both asynchronously and in direct interactions.

Requirements

  • Proven experience designing, building, and running distributed cloud architectures and large-scale web-based applications.
  • Strong programming skills and a willingness to dive into codebases and native components (experience with Python and C++ preferred).
  • Deep expertise in Kubernetes, including designing/operating clusters, custom controllers/operators, and autoscaling tuning (KEDA, Karpenter).
  • Experience with ClickHouse or with deploying database systems at scale is a strong plus, as is the ability to reason about database internals for performance tuning.
  • Proficiency working with AWS and GCP cloud providers.
  • Familiarity with Linux-based production environments.
  • Experience or familiarity with OpenResty, Varnish, Redis, Terraform, and Ansible is beneficial.
  • Ability to think in systems terms, anticipate edge cases and failure modes, and take ownership of reliability and incident resolution.
  • Willingness to participate in on-call rotations and to be based in an EU time zone.

Benefits

  • Salary range €62,000 - €109,000 per year plus a stock options grant.
  • 22 days of holiday per year, plus your birthday and public holidays.
  • Remote-first work with the freedom to work from wherever suits you best.
  • Up to €2,400 to help set up your home workspace.
  • Access to company offices in Madrid and New York City when needed.
  • Early-stage company impact and transparency into company decisions and roadmaps.
