Flip App

Flip is the employee app reshaping workplace communication by empowering every employee with a digital workspace for effective communication and workflow management.

Internet Software & Services

Information Technology

51-250 employees (180)

Founded 2018

8 open positions

Senior Site Reliability Engineer (m/f/d)

1 month, 3 weeks ago

Europe, Germany

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

API Gateway, Argo CD, Azure, CI/CD, GitOps, Go, Grafana, Kubernetes, PostgreSQL, Prometheus, Pulumi, Python, Terraform

Description

Own critical reliability domains end-to-end within the Platform Squad.
Drive technical direction and architectural decisions for the platform.
Help evolve cloud infrastructure on Azure and Kubernetes for high throughput and high availability.
Define and improve the platform’s resilience strategy, including scaling, zero-downtime deployments, rollback mechanisms, and disaster recovery.
Improve the observability stack built around Loki, Grafana, Tempo, and Mimir.
Reduce infrastructure toil by making the IaC platform more self-service for engineering teams.
Lead platform-related major incidents and drive blameless post-mortems.
Coach teammates, run RFCs and design reviews, and mentor engineers within the squad.
Partner with the squad to shape the platform roadmap and direction.

Requirements

5+ years of hands-on experience as an SRE, Platform Engineer, DevOps Engineer, Infrastructure Engineer, Cloud Engineer, or Backend Engineer with a strong infrastructure focus.
Proven track record of building and operating high-throughput, highly available production systems.
Deep production-level experience with Kubernetes on any hyperscaler.
Strong experience with modern observability stacks such as Prometheus, Mimir, VictoriaMetrics, Dash0, Loki, or ELK, plus a clear point of view on SLIs, SLOs, and error budgets.
Solid software development skills in Go or Python; Go is strongly preferred because the IaC runs on Pulumi in Go.
Hands-on experience with Infrastructure as Code tools such as Pulumi, OpenTofu, or Terraform, with GitOps tools such as Argo CD, and with CI/CD pipeline design.
Demonstrated ability to lead complex infrastructure initiatives from design to production, including writing RFCs and driving architecture decisions.
Experience mentoring engineers and raising the technical bar within a team.
Comfortable owning major incidents end-to-end and turning learnings into systemic change.
Strong communication skills and business-fluent English.
Willingness to participate in on-call rotations.
Preferred: experience rolling out production-ready API gateways built on the Kubernetes Gateway API, such as Envoy Gateway.
Preferred: experience operating multi-cluster service meshes such as Cilium, Linkerd, or Istio.
Preferred: experience deploying and maintaining Kubernetes Operators such as Strimzi or CNPG.
Preferred: experience operating highly available PostgreSQL in production.

Benefits

Remote-first work with flexibility to work from home.
Occasional team events, workshops, or meetings in the Berlin or Stuttgart offices with plenty of notice.
E-Gym-Wellpass membership covered by the company.
Job bike leasing.
Regular team events and culture days.
Option to work abroad within the European Union.
Relaxed working atmosphere with highly motivated and committed colleagues.
