66degrees

66degrees: Google Cloud Premier Partner shaping the Future of Work with AI and data solutions.

IT Services
251-1K

Description

  • Ensure near-zero downtime through monitoring, alerting, self-healing automation, and continuous improvement.
  • Build highly automated, available, and scalable systems using software engineering and infrastructure principles.
  • Advise clients on DevOps and SRE practices, including deployment pipelines, high availability, service reliability, technical debt, and operational toil.
  • Take a proactive approach to client workloads by anticipating failures, automating tasks, and maintaining availability and customer experience.
  • Work closely with clients, teammates, and Google engineers to investigate and resolve infrastructure issues.
  • Contribute to documentation, open-source efforts, and operational improvements.
  • Support live services running at scale across varied customer environments.

Requirements

  • 1-2 years of cloud and infrastructure experience with Linux, Windows, Kubernetes, databases, and networking services.
  • 1+ years of Google Cloud experience; related certifications are strongly preferred but not required.
  • Proficiency with Python is required; experience with other programming languages is a plus.
  • Strong provisioning and configuration skills using Terraform.
  • Experience with 24x7x365 monitoring, incident response, and on-call support.
  • Experience troubleshooting issues across systems, network, and code.
  • Experience negotiating Error Budgets, SLIs, SLOs, and SLAs with product owners.
  • Ability to work independently and collaboratively across teams.
  • Experience working in Agile Scrum and Kanban methodologies within the SDLC.
  • Strong communication skills for a heavily customer-facing role.
  • Bachelor’s degree in computer science, electrical engineering, or an equivalent field is required.
