Capital Rx

Capital Rx provides comprehensive health benefit management and transparent pharmacy benefit management solutions, integrating various healthcare services to support millions of plan members across diverse sectors.

Health Care Providers & Services
251-1K employees
Founded 2017

Description

  • Define, own, and build the company-wide observability strategy, tooling, and platform products.
  • Architect, implement, and maintain the LGTM stack (Loki, Grafana, Tempo, Mimir) across engineering teams.
  • Build production-grade internal observability products with React/TypeScript frontends and Python/Rust backends.
  • Develop high-performance log indexing and search solutions for large-scale log data.
  • Design and implement SQL-based analytics workflows for ad hoc and historical log analysis.
  • Integrate AWS observability services with the custom observability platform to provide unified visibility.
  • Create dashboards, monitors, and alerting systems that reduce noise and detect anomalies.
  • Partner with engineering teams to establish logging, metrics, and tracing standards and instrument services effectively.
  • Lead workshops, create documentation, and build self-service tooling to drive observability adoption.
  • Mentor engineers, lead architecture reviews, and represent the Scalability team in cross-functional planning.

Requirements

  • 10+ years of software engineering or infrastructure engineering experience with progression into technical leadership roles.
  • Several years of experience leading technical initiatives, building platform products, or serving as an observability subject matter expert.
  • Strong experience with React/TypeScript for frontend development and Python (Flask/SQLAlchemy) for backend services.
  • Deep production experience with the LGTM stack, including Loki, Grafana, Tempo, and Prometheus/Mimir.
  • Extensive experience with AWS CloudWatch Logs and Metrics, including custom metrics, Logs Insights, dashboards, and integrations.
  • Production experience with SQL-based log analytics using AWS Athena, DuckDB, or similar query engines.
  • Demonstrated ability to architect solutions using both managed cloud services and open-source tooling.
  • Hands-on experience with search and indexing systems such as OpenSearch, Elasticsearch, Lucene, or Tantivy.
  • Experience building high-performance systems that process millions of log lines or high-cardinality metrics.
  • Deep understanding of distributed systems and microservices architectures, and the observability challenges they create.
  • Proven track record handling high-volume structured and unstructured logging data and building efficient search/query solutions.
  • Ability to build internal platform products with strong attention to UX, performance, and reliability.
  • Production experience with Rust for high-performance data processing, indexing, or search systems preferred.
  • Experience with Terraform for observability infrastructure and AWS resources preferred.
  • Experience with Datadog, New Relic, Splunk, or other enterprise observability platforms preferred.
  • Deep expertise with PromQL, LogQL, SQL optimization, and query optimization for high-cardinality data preferred.
  • Experience with Parquet, ORC, or other columnar storage formats for S3-based analytics preferred.
  • Experience designing incident response workflows, postmortems, and SLO/SLI frameworks preferred.
  • Track record of reducing observability costs while maintaining or improving capabilities preferred.
  • Experience with streaming data pipelines, ETL, or real-time data processing preferred.
  • Deep knowledge of OpenTelemetry, Jaeger, Zipkin, or distributed tracing architectures preferred.
  • Git expertise and experience working in a monorepo preferred.
  • Previous PBM or healthcare technology experience preferred.
  • Experience building developer tools or internal platforms that improve engineering productivity preferred.

Benefits

  • Remote work location.
  • Salary range of $160,000 to $220,000 USD.
  • Equal employment opportunity and a workplace committed to diversity and inclusion.
  • Application data may be retained for future consideration, subject to the company's privacy notice.
