CI&T

CI&T is a global digital technology agency empowering agile growth for leading companies through advanced technologies with a team of 2000 experts worldwide.

Internet Software & Services

Information Technology

5K-10K (5564)

Founded 1995

164 open positions

[Job - 29712] Senior DevOps / SRE

9 hours, 23 minutes ago

Brazil

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

AWS, AWS CDK, Azure, C#, CI/CD, Datadog, Docker, Gatling, GitHub Actions, GitLab CI, Grafana, Jaeger, K6, Kubernetes, .NET, Next.js, OpenTelemetry, Prometheus, Pulumi, Terraform, TypeScript, WAF

Description

Design, implement, and evolve CI/CD pipelines for .NET and Next.js applications to support fast, secure, and traceable releases.
Manage and improve container infrastructure with Docker and Kubernetes, including deployments, autoscaling, and resource management.
Implement and maintain the product observability stack, including metrics, logs, traces, and operational dashboards.
Build and maintain SRE dashboards covering SLIs, SLOs, and error budgets.
Configure proactive alerts and runbooks for incident response.
Collaborate with developers on code instrumentation standards, including structured logs and distributed traces.
Work with AWS and infrastructure security practices to support a reliable production environment.
Support QA in running automated tests in ephemeral, container-isolated environments.
Contribute to engineering culture through runbooks, post-mortems, and continuous process improvement.
Investigate and resolve incidents with urgency and clear communication across technical and business teams.

Requirements

Solid experience with CI/CD tools such as GitHub Actions, GitLab CI, Azure DevOps, or equivalent.
Strong hands-on experience with Docker and Kubernetes in production, including deployments, services, ingress, HPA, and namespaces.
Experience with AWS, especially EKS, ECR, Secrets Manager, IAM, and WAF.
Knowledge of observability tools such as Datadog, Grafana, Prometheus, OpenTelemetry, or similar.
Experience building operational dashboards focused on availability, latency, errors, and saturation using RED, USE, or Four Golden Signals models.
Familiarity with infrastructure as code tools such as Terraform, Pulumi, or CDK.
Knowledge of database monitoring for connection health, slow queries, and locks.
Understanding of infrastructure security practices such as secrets rotation, least privilege, and network policies.
Ability to read and understand .NET/C#, TypeScript, and Next.js code to support instrumentation and troubleshooting.
Preferred experience with service mesh technologies such as Istio or Linkerd.
Preferred knowledge of distributed tracing tools such as Jaeger, Tempo, or Datadog APM.
Preferred experience with incident management and creating runbooks and operational playbooks.
Preferred experience with performance and load testing tools such as k6 or Gatling integrated into CI/CD pipelines.
Preferred experience working in multi-tenant environments and isolating observability by client.

Benefits

Health and dental insurance.
Meal and food allowance.
Childcare assistance.
Extended parental leave.
Gym and wellness partnerships through Wellhub (Gympass) and TotalPass.
Profit sharing (PLR).
Life insurance.
Continuous learning platform (CI&T University) and partnerships with online course and language-learning platforms.

Similar Roles

Lead Site Reliability Engineer - 10929

Coupa Software, 1K-5K, Internet Software & Services

Coupa is hiring a Lead Site Reliability Engineer to support and evolve its cloud and GenAI platform infrastructure, with a focus on reliability, automation, and scalable operations.

India, Full-time, Lead, DevOps Engineer, Site Reliability Engineer (SRE)

AWS, Azure, Bash, Chef, DNS, GCP, Git, GitHub Actions, Helm, Kubernetes, Linux, LLM, MySQL, New Relic, PagerDuty, Python, SageMaker, Terraform

3 hours, 58 minutes ago

Site Reliability Engineer (Remote)

Libertex Group, 251-1K, Capital Markets

Libertex Group is hiring an SRE Engineer to support and improve the reliability, performance, and availability of its large-scale production systems for its online trading platform.

Anywhere, Full-time, Mid Level, Site Reliability Engineer (SRE)

Ansible, Apache Airflow, AWS, Azure, Bash, CDN, CI/CD, DNS, Docker, GCP, GitLab, Grafana, HTTP, Jenkins, Kubernetes, PowerShell, Prometheus, Python, SQL, SQL Server

4 hours, 20 minutes ago

Senior AIOps Engineer, Incident Response [Remote-US]

Quanata, 201-500, Information Technology & Services

Quanata is hiring an experienced production operations and reliability leader to oversee production health, incident response, and operational support for its AI-driven insurance technology platform.

United States, Full-time, Senior, AI Engineer, Site Reliability Engineer (SRE)

$215k-$280k

AWS, Confluence, JIRA

7 hours, 15 minutes ago

Senior Site Reliability Engineer

Amwell, 1K-5K, Diversified Telecommunication Services

Amwell is hiring a Senior Systems Engineer to support and automate infrastructure across its data center and cloud environments for telehealth services.

United States, Full-time, Senior, Site Reliability Engineer (SRE)

$129k-$140k

Active Directory, Ansible, AWS, Azure, Bash, Elasticsearch, ELK Stack, GCP, Kibana, Linux, Logstash, PowerShell, Puppet, Python, TCP/IP, Terraform

10 hours, 28 minutes ago
