Site Reliability Engineer - Canada Wide - Remote

5 hours, 1 minute ago
Senior
DevOps and Infrastructure
Newton

Newton provides a user-friendly platform for Canadians to buy and sell Bitcoin, Ethereum, and over 70 other cryptocurrencies, offering competitive trading fees and a seamless trading experience.

Industry: Capital Markets
Company size: 51-250 employees
Founded: 2018
Funding: $35M raised

Description

  • Implement improvements to infrastructure reliability, fault tolerance, scalability, and performance.
  • Manage incidents and coordinate the appropriate teams during operational issues.
  • Respond to automated alerts and support critical services through an on-call rotation.
  • Define and maintain SLIs, SLOs, SLAs, and error budgets to guide reliability decisions.
  • Improve observability across systems through metrics, logs, and tracing.
  • Reduce the time it takes to detect, troubleshoot, and resolve production issues.
  • Enhance monitoring, alerting, dashboards, tracing, and runbooks for critical services.
  • Lead postmortems and follow-up actions to prevent repeat incidents.
  • Automate manual operational practices and help build better incident response processes.
  • Work closely with engineering teams to improve system design and operational excellence.

Requirements

  • Experience designing and operating scalable, reliable systems in AWS or a similar cloud environment.
  • Experience handling on-call shifts for critical systems.
  • Experience with chaos engineering practices or tools such as Gremlin.
  • Ability to debug live production systems.
  • Experience writing code and deploying it with zero downtime.
  • Experience scripting or developing in Linux shell, Python, JavaScript, Java, or similar languages.
  • Self-starter with the ability to take initiative in ambiguous environments, preferably in a startup setting.

Benefits

  • Remote work across Canada.
  • Inclusive work environment that welcomes candidates from all backgrounds and perspectives.
  • Reasonable accommodations provided during the application process for candidates who need assistance.
