Incident Commander

1 month ago
Full-time
Senior
DevOps and Infrastructure
Caseware

CaseWare International Inc. provides cutting-edge software solutions for accounting firms, corporations, and governments, enabling users worldwide to work smarter and transform insights into impact.

Internet Software & Services
251-1K employees
Founded 1988

Description

  • Initiate and oversee incident response efforts as the primary point of coordination after an incident is detected.
  • Act as the authoritative voice during incidents and drive teams toward rapid resolution.
  • Collaborate with engineers, product management, support, and other cross-functional teams during active incidents.
  • Use and integrate tools such as JIRA, PagerDuty, New Relic, AWS, and Microsoft Teams to monitor and coordinate incident handling.
  • Ensure the right stakeholders are engaged to support recovery and resolution efforts.
  • Communicate timely updates, resolution plans, and incident status to internal and external audiences.
  • Track and report uptime metrics to promote transparency in system reliability and performance.
  • Lead post-mortem sessions and produce post-incident review (PIR) and root cause analysis (RCA) documentation, including timelines, impact, root cause, remediation, and preventive actions.
  • Follow up on action items from post-incident reviews to help prevent recurrence.
  • Implement proactive strategies and tools to reduce operational risk and strengthen system resilience.

Requirements

  • 5+ years of experience managing critical incidents in SaaS environments.
  • Experience in a similar role, preferably within a software or technology company.
  • Prior knowledge of cloud environments, AWS, DevOps practices, or related technical operations.
  • Strong technical background in incident management and response.
  • Proven ability to lead teams through rapid incident resolution.
  • Solid understanding of the modern software landscape.
  • Familiarity with JIRA and PagerDuty integrations.
  • Excellent written and verbal communication skills in English.
  • Strong collaboration skills.
  • Ability to perform well under pressure and manage competing priorities effectively.

Benefits

  • Remote, full-time permanent role.
  • Flexible work options.
  • Generous time-off policies.
  • Competitive salary.
  • Comprehensive benefits, including health insurance and retirement plans.
  • Performance bonuses and recognition programs.
  • Opportunities for career growth.
  • Opportunity to work on international projects with a global team.
