Site Reliability Engineer

4 weeks, 1 day ago
Full-time
Senior
DevOps and Infrastructure
TextNow

TextNow is a leading provider of free phone service, offering calling and texting through its app and SIM card. With a focus on affordability and innovation, TextNow is revolutionizing mobile phone service with cloud-based technology, providing users w...

Wireless Telecommunication Services
51-250
Founded 2009

Description

  • Design, build, and maintain scalable, resilient, highly available systems for TextNow’s infrastructure and services.
  • Develop and maintain infrastructure automation using Terraform, Ansible, and related tools.
  • Support cloud deployment, scaling, and operations for AWS-based systems.
  • Participate in an on-call rotation and respond to production incidents.
  • Troubleshoot issues, drive incident resolution, and reduce downtime.
  • Conduct post-mortems and implement corrective actions to improve reliability.
  • Implement and improve observability through logging, metrics, and monitoring solutions.
  • Collaborate with software engineers, DevOps, and product teams to improve reliability from development to production.
  • Identify opportunities to improve architecture, automation, and operational practices.
  • Contribute to the design and implementation of new SRE best practices.

Requirements

  • 5+ years of experience in an operationally focused role such as SRE, DevOps, or Infrastructure Engineering.
  • Deep understanding of reliability, scalability, and performance optimization.
  • Hands-on experience with AWS, GitHub, Terraform, Ansible, or similar tools.
  • Experience handling production incidents, performing root cause analysis, and implementing long-term fixes.
  • Strong focus on automation and scripting to reduce operational toil.
  • Experience building robust observability with logging, metrics, and monitoring tools.
  • Ability to work cross-functionally with engineers, product teams, and leadership.
  • Experience in a remote or distributed working environment is preferred.
  • The role is based in Canada; compensation is listed in CAD, and in USD for select markets.
  • Applicants must be eligible to work in the relevant hiring location.

Benefits

  • Competitive pay with a stated salary range of $113,400 - $162,000 CAD.
  • Employee stock options.
  • Unlimited vacation and 12 paid holidays per year.
  • Flexible work arrangements, including work-from-home, remote, and in-office options.
  • Health, dental, and vision benefits.
  • Short-term and long-term disability coverage.
  • $750 annual wellness benefit or healthcare spending account.
  • RRSP matching in Canada or 401(k) in the USA.
  • Parental leave for eligible employees.
  • Learning and development opportunities.
  • Free phone service.
  • Strong work-life blend.


