Kaseya

Kaseya provides integrated IT management and security solutions for MSPs and SMBs, enabling centralized IT operations, remote management, cybersecurity, and automation.

IT Services
1K-5K employees
Founded 2000
$567M raised

Description

  • Set, monitor, and enforce SLOs, SLIs, and error budgets to maintain service reliability.
  • Lead incident response, troubleshooting, and blameless postmortems that drive permanent fixes.
  • Build and maintain automated deployment, configuration management, and infrastructure provisioning using Infrastructure as Code.
  • Manage cloud and hybrid infrastructure with Terraform or CloudFormation, balancing cost, scalability, and resilience.
  • Improve observability through proactive monitoring, alerting, and dashboards that surface issues early.
  • Partner with development teams to embed reliability into the SDLC, including deployment automation, capacity planning, and chaos engineering.
  • Reduce operational toil through automation and self-healing systems.
  • Support containerized and serverless workloads to keep production systems highly available and fault tolerant.
  • Stay current on SRE, cloud, and observability practices and bring improvements back to the team.

Requirements

  • 4 to 5 years of AWS production experience.
  • Experience owning infrastructure as code with Terraform or CloudFormation, including state management.
  • AWS ECS production experience, or a strong Kubernetes background with willingness to ramp up.
  • Active on-call rotation experience, including leading incidents and writing postmortems.
  • Working fluency with SLOs, SLIs, and error budgets in production.
  • Kubernetes production experience preferred.
  • Experience with observability tools such as Datadog, Dynatrace, CloudWatch, or Elasticsearch/Kibana preferred.
  • Experience with chaos engineering preferred.
  • Experience with AWS Lambda or other serverless workloads preferred.
  • Experience with Ansible, Chef, or Puppet preferred.
  • DevSecOps experience, including vulnerability scanning, secrets management, SOC2, or ISO 27001, preferred.
  • Production database support experience with RDS, PostgreSQL, or MySQL preferred.
  • Open source contributions or a public technical portfolio preferred.

Benefits

  • Annual base salary of CAD $115,000 to CAD $130,000.
  • Final offer determined based on experience, skills, and internal equity.
  • Equal employment opportunity across all protected characteristics.
