Sustaining Engineering Lead

2 hours, 36 minutes ago
Full-time
Lead
DevOps and Infrastructure
Actian

Actian Corp provides hybrid data management, integration, and analytics services worldwide, helping organizations make confident, data-driven decisions and accelerate growth.

IT Services
251-1K employees
Founded 2005

Description

  • Own the full escalation lifecycle from initial triage through verified production deployment.
  • Immediately assign incoming escalations and ensure there are no gaps in response ownership.
  • Qualify issues by verifying reproducibility, severity, and available workarounds, and by determining whether the case is a defect or a feature gap.
  • Lead and optimize the rotating engineer squad drawn from feature teams.
  • Ensure rotating engineers retain ownership of their tickets until resolution, even after their rotation ends.
  • Act as the Incident Manager during high-severity outages and coordinate the engineering response.
  • Serve as the primary technical bridge between Engineering, Support, Customer Success, and Product.
  • Join customer calls when high-level technical input or escalation support is needed.
  • Track operational metrics such as response time, qualification time, and time to ready state to identify bottlenecks.
  • Verify that fixes move successfully through CI/CD and are deployed to production.

Requirements

  • Strong background in Software Engineering, Site Reliability Engineering, or Tier 3 Support Engineering.
  • Experience with complex SaaS architectures, data platforms, or distributed systems.
  • Proven experience managing production outages and critical enterprise customer escalations.
  • Ability to lead calmly and decisively, and to communicate clearly, during high-pressure incidents.
  • Deep expertise with Jira, GitHub, CI/CD pipelines, and modern monitoring tools.
  • Exceptional verbal and written English communication skills.
  • Ability to translate technical issues into clear business impact for internal executives and enterprise clients.
  • Comfort using AI to detect and triage incidents.
  • High-agency, proactive mindset with strong personal accountability.
  • Experience working in a remote or hybrid environment (preferred).

Benefits

  • Competitive salary and benefits package.
  • Flexible work arrangements, including remote or hybrid options.
  • Opportunities for professional growth and development.
  • The chance to join a fast-growing company in the data management space.
  • Collaboration with a passionate and diverse team.


