Capital Rx

Capital Rx provides comprehensive health benefit management and transparent pharmacy benefit management solutions, integrating various healthcare services to support millions of plan members across diverse sectors.

Health Care Providers & Services
251-1K
Founded 2017

Description

  • Define, own, and build the company-wide observability strategy, tooling, and platform products.
  • Architect, implement, and maintain the LGTM stack (Loki, Grafana, Tempo, and Prometheus/Mimir) across engineering teams.
  • Build production-grade internal observability products with React/TypeScript frontends and Python/Rust backends.
  • Develop high-performance log indexing and search solutions for large-scale log data.
  • Design and implement SQL-based analytics workflows for ad hoc and historical log analysis.
  • Integrate AWS observability services with the custom observability platform to provide unified visibility.
  • Create dashboards, monitors, and alerting systems that reduce noise and detect anomalies.
  • Partner with engineering teams to establish logging, metrics, and tracing standards and instrument services effectively.
  • Lead workshops, create documentation, and build self-service tooling to drive observability adoption.
  • Mentor engineers, lead architecture reviews, and represent the Scalability team in cross-functional planning.

Requirements

  • 10+ years of software engineering or infrastructure engineering experience with progression into technical leadership roles.
  • Several years of experience leading technical initiatives, building platform products, or serving as an observability subject matter expert.
  • Strong experience with React/TypeScript for frontend development and Python (Flask/SQLAlchemy) for backend services.
  • Deep production experience with the LGTM stack, including Loki, Grafana, Tempo, and Prometheus/Mimir.
  • Extensive experience with AWS CloudWatch Logs and Metrics, including custom metrics, Logs Insights, dashboards, and integrations.
  • Production experience with SQL-based log analytics using AWS Athena, DuckDB, or similar query engines.
  • Demonstrated ability to architect solutions using both managed cloud services and open-source tooling.
  • Hands-on experience with search and indexing systems such as OpenSearch, Elasticsearch, Lucene, or Tantivy.
  • Experience building high-performance systems that process millions of log lines or high-cardinality metrics.
  • Deep understanding of distributed systems and microservices architectures, and the observability challenges they create.
  • Proven track record handling high-volume structured and unstructured logging data and building efficient search/query solutions.
  • Ability to build internal platform products with strong attention to UX, performance, and reliability.
  • Production experience with Rust for high-performance data processing, indexing, or search systems preferred.
  • Experience with Terraform for observability infrastructure and AWS resources preferred.
  • Experience with Datadog, New Relic, Splunk, or other enterprise observability platforms preferred.
  • Deep expertise with PromQL, LogQL, SQL optimization, and query optimization for high-cardinality data preferred.
  • Experience with Parquet, ORC, or other columnar storage formats for S3-based analytics preferred.
  • Experience designing incident response workflows, postmortems, and SLO/SLI frameworks preferred.
  • Track record of reducing observability costs while maintaining or improving capabilities preferred.
  • Experience with streaming data pipelines, ETL, or real-time data processing preferred.
  • Deep knowledge of OpenTelemetry, Jaeger, Zipkin, or distributed tracing architectures preferred.
  • Git expertise and experience working in a monorepo preferred.
  • Previous PBM or healthcare technology experience preferred.
  • Experience building developer tools or internal platforms that improve engineering productivity preferred.

Benefits

  • Remote work location.
  • Salary range of $160,000 to $220,000 USD.
  • Equal employment opportunity and a workplace committed to diversity and inclusion.
  • Privacy notice and retention of application data for future consideration.
