AlphaSense

AlphaSense develops an artificial intelligence-based search platform that enables investment and corporate professionals to quickly access and analyze extensive financial data and market insights from over 500 million documents, enhancing decision-maki...

Internet Software & Services
251-1K
Founded 2011
$770M raised

Description

  • Architect reliability frameworks and self-service tooling that enable teams to own the reliability of their services.
  • Drive AIOps initiatives to automate diagnostics, remediation, and proactive failure prevention.
  • Embed SRE practices across engineering through design reviews, production readiness, and operational standards.
  • Serve as Incident Commander during critical incidents and ensure blameless postmortems lead to durable improvements.
  • Deliver end-to-end monitoring, tracing, and profiling to improve system performance proactively.
  • Mentor engineers across SRE and product teams through technical guidance and knowledge sharing.
  • Influence architectural decisions and set the technical bar for reliability across the organization.
  • Lead by example in incident response and help scale a “You Build It, You Run It” culture.

Requirements

  • 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role.
  • At least 3 years of experience in a Senior+ SRE position.
  • Experience running production SaaS systems at scale.
  • Proficiency in at least one programming or scripting language such as Python or Go.
  • Hands-on experience with Kubernetes and with cloud platforms such as AWS, GCP, or Azure.
  • Deep understanding of networking fundamentals, including TCP/IP, DNS, HTTP/S, and load balancing.
  • Experience with monitoring and alerting tools such as Prometheus, Grafana, Datadog, or ELK.
  • Familiarity with advanced observability tooling such as OpenTelemetry (OTel) and continuous profiling.
  • Proven incident management experience, including leading high-severity incidents and postmortems.
  • Strong troubleshooting, communication, and collaboration skills.

Similar Roles

NoSQL Database Engineer II

LivePerson 1K-5K Internet Software & Services

LivePerson is hiring a NoSQL Database Engineer (L2) in India to support production reliability and platform engineering for large-scale NoSQL systems and cloud infrastructure.

Bash Cassandra Couchbase GCP Go Grafana Prometheus Python Redis Terraform
10 hours, 11 minutes ago

Sr. Production Engineer, Solutions Engineering

Pinterest 5K-10K Internet Software & Services

Pinterest is hiring a Senior Production Engineer on Solutions Engineering to design AI-driven reliability and automation systems that improve the operation of large-scale distributed infrastructure serving hundreds of millions of users.

Ansible AWS Azure Chef Docker Envoy GCP Go Hadoop Kafka Kubernetes Linux MySQL Puppet Python Terraform Unix
10 hours, 11 minutes ago

Senior Network Site Reliability Engineer

Miro 1K-5K Internet Software & Services

Miro is hiring a Senior Network Site Reliability Engineer to strengthen the reliability, availability, and scalability of its AWS-based production infrastructure.

Agile AWS Azure Bash CI/CD DNS EC2 GCP GitHub GitLab Kubernetes Linux Python TCP/IP Terraform
10 hours, 26 minutes ago

Senior Site Reliability Engineer - Network

Harford County Public Library 51-250 Diversified Consumer Services

Stone Tech, part of Stone Co., is looking for a Senior Site Reliability Engineer - Network to lead critical network infrastructure projects and evolve the group's global connectivity architecture.

Ansible API Gateway AWS Azure Cisco Datadog Fortinet GCP Kong Palo Alto Prometheus SIEM Splunk Terraform Zabbix
10 hours, 41 minutes ago
