SRE / Platform Reliability Architect

1 hour, 1 minute ago
Full-time
Lead
DevOps and Infrastructure
NEORIS

NEORIS is a leading global IT consulting company specializing in nearshore outsourcing services and SAP solutions, empowering companies to innovate through digital transformation.

Internet Software & Services
5K-10K
Founded 2000

Description

  • Define the organizational reliability strategy, including SLIs, SLOs, and error budgets.
  • Lead Sev-1 incidents and coordinate response efforts and communication.
  • Design highly available, scalable, and resilient architectures.
  • Implement advanced observability and automation practices.
  • Optimize cloud cost, performance, and resiliency.
  • Define and review CI/CD, security, and operations standards.
  • Mentor and guide mid-level and junior technical team members.
  • Align technical reliability initiatives with business objectives.

Requirements

  • 6+ years of experience in SRE, DevOps, or Architecture roles.
  • Expert-level Kubernetes experience, including operations and advanced troubleshooting.
  • Deep knowledge of cloud environments, including IAM, networking, security, and resiliency.
  • Advanced infrastructure as code experience with Terraform, including modules and pipelines.
  • Experience with distributed systems and resiliency patterns such as circuit breakers, retries, and autoscaling.
  • Hands-on experience with observability tools such as Prometheus, Grafana, ELK, and OpenTelemetry.
  • Strong architectural design skills and the ability to make critical technical decisions.

Benefits

  • Private health insurance (medicina prepagada).
  • 3 NEORIS Days as paid time off.
  • Annual performance bonus.
  • Vacation bonus.
  • Access to training and development platforms.
  • Connectivity subsidy.
