Site Reliability Engineer (SRE) Manager

Full-time
Lead
DevOps and Infrastructure
Leadtech

Leadtech is a Barcelona-based online innovation technology company that has grown rapidly since 2007, becoming an industry leader in online project management with a global team of over 570 professionals.

Industry: IT Services
Company size: 251-1K employees
Founded: 2009

Description

  • Lead a team of Site Reliability Engineers and support their day-to-day execution.
  • Oversee the architecture, scalability, maintenance, security, and efficiency of cloud infrastructure.
  • Contribute hands-on as a player-coach to engineering and operational tasks.
  • Define and implement SRE practices across projects, including CI/CD, Infrastructure as Code, automated failovers, chaos engineering, and blameless post-mortems.
  • Establish and track reliability metrics such as SLIs, SLOs, SLAs, error budgets, MTTR, and MTTD.
  • Collaborate with Product and Engineering teams to align the infrastructure roadmap with product iterations.
  • Assess the technical feasibility of new projects and features and provide estimates, capacity planning, and architectural recommendations.
  • Promote teamwork and shared ownership between SRE and software development teams.
  • Champion a culture of reliability across the broader engineering organization.

Requirements

  • Previous experience in Site Reliability Engineering, systems engineering, or software engineering with an infrastructure focus.
  • Experience supporting multiple products using cloud services such as Google Cloud and AWS.
  • Experience implementing SRE best practices, including Infrastructure as Code, observability, automation, and incident management.
  • Experience with AI, such as AI-driven operations (AIOps) or supporting AI-based infrastructure.
  • Proven leadership of SRE, DevOps, Platform, or Infrastructure teams focused on reliability and scalability.
  • Previous experience managing people, delivery, system quality, and operational processes.
  • Strong ability to promote collaboration between SRE and software development teams.
  • Experience working in agile and dynamic environments.
  • Excellent communication and leadership skills.
  • Fluent English is required; Spanish is a plus.

Benefits

  • Competitive salary and full-time permanent contract.
  • Flexible work arrangement with full remote or Barcelona office options.
  • Flexible schedule with flextime, including a 7-hour workday on Fridays with the afternoon free.
  • 35-hour workweek in July and August.
  • 25 days of vacation plus your birthday off, with no blackout days.
  • Top-tier private health insurance, including dental and psychological services.
  • Annual budget for external learning and personalized internal training.
  • Office perks in Barcelona including free coffee, fresh fruit, snacks, a game room, and a rooftop terrace.
  • Ticket restaurant and nursery vouchers paid from gross salary.
