66degrees

66degrees: Google Cloud Premier Partner shaping the Future of Work with AI and data solutions.

IT Services
251-1K

Description

  • Ensure near-zero downtime through monitoring, alerting, self-healing automation, and continuous improvement.
  • Build highly automated, highly available, and scalable systems by applying software engineering and infrastructure principles.
  • Advise clients on DevOps and SRE practices, including deployment pipelines, high availability, service reliability, technical debt, and operational toil.
  • Take a proactive approach to client workloads by anticipating failures, automating tasks, ensuring availability, and supporting customer experience.
  • Work with clients, internal teams, and Google engineers to investigate and resolve infrastructure issues.
  • Manage a Jira queue of inbound requests across multiple clients while balancing and prioritizing work.
  • Contribute to documentation, open-source efforts, and operational improvements.
  • Support deployment and optimization of cloud workloads using Google Cloud technologies and related tooling.

Requirements

  • 4+ years of cloud and infrastructure experience, including Linux, Windows, Kubernetes, databases, and networking services.
  • 2+ years of full-time Google Cloud experience preferred.
  • Proficiency with Python required.
  • Strong provisioning and configuration skills using Terraform.
  • Experience troubleshooting issues across systems, networks, and code.
  • Experience with 24x7x365 monitoring, incident response, and on-call support preferred.
  • Experience determining and negotiating error budgets, SLIs, SLOs, and SLAs with product owners.
  • Experience working with Agile methodologies (Scrum and Kanban) throughout the SDLC.
  • Ability to work independently and collaboratively across teams.
  • Strong communication skills in a heavily customer-facing role.
  • Bachelor’s degree in Computer Science, Computer Engineering, or related field, or equivalent work experience.
  • Microsoft Server and SQL Server experience is a plus but not required.
