Senior AIOps Engineer, Incident Response [Remote-US]

42 minutes ago
Full-time
Senior
DevOps and Infrastructure
Quanata

Quanata is a software development company based in San Francisco, specializing in context-based insurance solutions. The company leverages AI, real-time telematics, and data science to enhance risk prediction, promote safer driving behaviors, and create modern insurance products. Quanata aims to transform the insurance industry by fostering positive behaviors and advancing digital experiences. The company develops a range of software platforms and tools for insurers. Their offerings include AI-powered risk assessment, telematics for driver monitoring, and claims solutions that optimize and automate processes. Quanata also focuses on customer engagement through personalized products and retention tools, supporting insurtech modernization with big data analytics and cloud-native platforms. With a team of around 26 professionals, Quanata draws on talent from Silicon Valley to drive innovation in the insurance sector.

Industry: Information Technology & Services
Company size: 201-500

Description

  • Own production health, reliability, and operational support processes across critical systems and services.
  • Lead incident response efforts, stakeholder communication, root cause analysis, and post-incident reviews.
  • Identify patterns in production issues and drive improvements to reduce recurring incidents and operational overhead.
  • Design and implement AI-driven agents and workflows that automate support and operational tasks.
  • Partner with engineering, product, and AI orchestration teams to improve system resilience and operational efficiency.
  • Build and maintain operational runbooks, documentation, and knowledge base content for human and AI-assisted workflows.
  • Support observability, monitoring, and troubleshooting across cloud-based production environments.
  • Participate in on-call rotations and continuously improve operational readiness and response processes.

Requirements

  • 6–8 years of experience in production operations, site reliability engineering, technical support engineering, or similar operational roles.
  • Strong background in incident management, root cause analysis, and production system troubleshooting.
  • Experience working within modern SDLC, DevOps, and change management environments.
  • Familiarity with operational tooling such as Jira, Confluence, and observability/monitoring platforms.
  • Strong analytical and problem-solving skills with the ability to identify trends and drive operational improvements.
  • Comfortable working cross-functionally with engineering, product, operations, and leadership teams.
  • Strong communication skills and ability to operate effectively in fast-moving technical environments.
  • Bachelor’s degree in Computer Science, Engineering, or equivalent relevant experience.
  • Experience building or working with AI/LLM-powered systems, intelligent agents, or workflow automation tools (bonus).
  • Familiarity with cloud platforms such as AWS and modern observability ecosystems (bonus).
  • Experience with event-driven architectures, orchestration frameworks, or operational automation platforms (bonus).
  • Background leading operational transformation or reliability improvement initiatives (bonus).

Benefits

  • Salary range of $215,000 to $280,000.
  • Medical, dental, vision, life insurance, and supplemental income plans for employees and dependents.
  • Headspace app subscription and a monthly wellness allowance.
  • 401(k) plan with company match.
  • One-time $2,000 home office equipment allowance for remote work.
  • Four weeks of PTO in the first year.
  • Twelve weeks of fully paid parental leave for new parents.
  • Up to $5,000 per year for professional learning, continuing education, and career development, plus LinkedIn Learning and BetterUp access.
  • Remote-first work environment within the U.S.; most positions do not require even occasional travel.
  • Core meeting hours from 9 AM to 2 PM Pacific time.
