Oowlish

Top Nearshore Software Developers And Tech Squads

Oowlish provides companies of all sizes access to the best technical talent in Brazil, making innovation more accessible and convenient than ever. Because our mission is to give every company,...

Internet Software & Services

Information Technology

51-250 (150)

Founded 2017

19 open positions

Senior Site Reliability Engineer (SRE)

1 hour, 16 minutes ago

Mexico, Argentina, Brazil, Colombia

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

AWS, CI/CD, Kubernetes, Microservices, Terraform

Description

Design, implement, and improve Site Reliability Engineering practices across production environments.
Define, manage, and continuously improve Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.
Lead and participate in incident response and incident command processes.
Build and evolve observability strategies, including monitoring, logging, alerting, and distributed tracing.
Improve system reliability, availability, scalability, and operational efficiency.
Partner with engineering teams to improve application performance and production readiness.
Develop automation solutions that reduce operational overhead and improve reliability.
Participate in root cause analysis and post-incident reviews.
Drive continuous improvement initiatives based on operational insights and incident learnings.
Help establish reliability best practices across teams and services.

Requirements

5+ years of professional experience in Site Reliability Engineering, DevOps, or Production Engineering roles.
Strong understanding of Site Reliability Engineering principles and best practices.
Experience supporting and operating production systems at scale.
Strong knowledge of monitoring, observability, and reliability engineering concepts.
Experience working in cloud-based environments.
Strong troubleshooting and problem-solving skills.
Experience working with distributed systems and modern application architectures.
Proven Site Reliability Engineering experience.
Experience defining and managing SLOs, SLIs, and error budgets.
Experience leading or actively participating in Incident Command and Incident Response processes.
Experience designing and implementing observability strategies.
Hands-on experience with monitoring, logging, alerting, and distributed tracing.
Experience improving system reliability, availability, and operational excellence.
Experience supporting mission-critical production environments.
Experience with cloud platforms, with AWS preferred.
Strong automation mindset.
Experience conducting root cause analysis and postmortems.
Kubernetes experience is nice to have.
Terraform or Infrastructure as Code experience is nice to have.
CI/CD pipeline experience is nice to have.
Experience with containerized environments is nice to have.
Experience with distributed microservices architectures is nice to have.
Experience with performance engineering is nice to have.
Experience mentoring engineers on reliability practices is nice to have.
Multi-cloud experience is nice to have.
Experience working in highly regulated or high-availability environments is nice to have.

Benefits

Remote work / home office.
Competitive compensation based on experience.
Career plans with extensive growth opportunities.
International projects.
Oowlish English Program for technical and conversational English.
Oowlish Fitness with Total Pass.
Games and competitions.
