HubSpot

HubSpot provides a comprehensive cloud-based CRM platform that integrates marketing, sales, service, and operations tools to help businesses attract, engage, and delight customers effectively.

Media

Consumer Discretionary

5K-10K employees (7,433)

Founded 2006

36 open positions

Director, Reliability Engineering

1 month ago

Ireland

Full-time

Executive

Site Reliability Engineer (SRE)

DevOps and Infrastructure

AWS, CI/CD, Microservices, System Design

Description

Lead and develop a team of ~20 reliability engineers, fostering operational excellence, continuous learning, and career growth.
Attract, retain, and grow top SRE talent and build clear engineering career paths.
Define and drive HubSpot’s reliability roadmap, prioritizing proactive resilience and incident reduction alongside cost and performance tradeoffs.
Set and evolve company-wide SLO standards to align engineering effort with customer experience.
Lead the strategy and implementation of AI-driven operations, integrating agentic approaches for incident detection, diagnosis, mitigation, and automated runbook execution.
Design and build intelligent systems that learn from operational history to surface risks and recommend or execute mitigations while balancing automation with human judgment.
Own incident management end-to-end, including response coordination, executive communication during major incidents, and blameless post-incident reviews to drive systemic improvements.
Influence engineering culture across 100+ product teams, identify systemic platform risks, and drive cross-functional mitigation efforts and alignment with Infrastructure, Product Engineering, and Security leadership.

Requirements

10+ years of experience in software engineering, SRE, or infrastructure, with 5+ years leading teams.
Proven track record of building and scaling reliability functions in environments with significant operational complexity.
Deep technical fluency with the ability to participate credibly in architecture discussions, incident analysis, and system design.
Experience with or strong interest in AIOps, agentic automation, or ML-driven observability, with curiosity about and a vision for how AI/ML can transform operations.
Proven ability to drive cultural and process change across large engineering organizations without relying on top-down mandates.
Strong executive communication skills; comfortable leading incident bridges, presenting to leadership, and representing reliability externally.
Experience with modern cloud infrastructure (AWS preferred), observability tooling, and incident management practices.
A philosophy that balances reliability with velocity, prioritizing sustainable speed over gating.

Benefits

Flexible remote-first / hybrid work environment with regional in-person onboarding and periodic in-person events.
Support for accommodations due to disability or travel limitations during hiring and onboarding.
High-visibility, high-impact leadership role with executive access and strategic influence.
Opportunity to shape how AI transforms operational practices across the company and potentially the industry.
Work at a globally distributed company recognized for an award-winning culture and focus on employee growth and connection.

Similar Roles

Senior Infrastructure Engineer - Postgres

ClickHouse, 51-250 employees, IT Services

Senior SRE / Senior Infrastructure Engineer at ClickHouse, responsible for owning reliability, automation, and operations for the company’s Postgres integration across AWS, GCP, and Azure, ensuring scalable, secure, and dependable cloud data platform services.

India, Full-time, Senior, Site Reliability Engineer (SRE)

AWS, Azure, CI/CD, ClickHouse, Docker, GCP, Go, Grafana, Kubernetes, OpenTelemetry, PostgreSQL, Prometheus, Terraform

1 month ago

Senior Field Engineer | UK | Remote

Grafana, 1K-5K employees, IT Services

Senior Field Engineering Infrastructure role at Grafana Labs, responsible for maintaining and developing the pre-sales demo kit and backend infrastructure, creating technical demos and training, and enabling the Solution Engineering team to scale adoption and close deals.

United Kingdom, Full-time, Senior, Sales Engineer, Site Reliability Engineer (SRE)

$145k-$175k

AWS, Azure, CI/CD, Datadog, Elasticsearch, GCP, Grafana, Kubernetes, Prometheus, Splunk, Terraform

1 month ago

Cloud / Platform Engineer (Remote)

Alex Staff Agency, 11-50 employees, Professional Services

Cloud/Platform Engineer at a U.S.-based EdTech company that operates a global, high-load digital learning platform, responsible for maintaining production reliability and operating multi-region cloud and Kubernetes infrastructure.

Azerbaijan, Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, Uzbekistan; Full-time; Mid Level; Site Reliability Engineer (SRE)

AWS, Bash, CI/CD, GCP, Go, Kubernetes, Python, Terraform

1 month ago

Customer Reliability Engineer

Sysdig, 251-1K employees, IT Services

Customer Reliability Engineer at Sysdig (remote, flexible for Italy/Spain), delivering senior-level technical support and escalation management so that customers can run and secure cloud/container environments reliably.

Italy, Spain; Full-time; Senior; Customer Success; Site Reliability Engineer (SRE)

AWS, Azure, Bash, Cassandra, Elasticsearch, GCP, Kafka, Kubernetes, Linux, PostgreSQL, Python, Shell Scripting

1 month ago
