Blink Health

Blink Health is a digital health company revolutionizing the prescription medication industry by providing affordable and accessible medications to millions of people across America. Their cloud-based pharmacy platform eliminates traditional roadblocks...

Health Care Providers & Services

Health Care

251-1K (360)

Founded 2014

$165M raised

7 open positions

Senior Cloud Resilience Architect

1 month, 1 week ago

India

Lead

Site Reliability Engineer (SRE)

DevOps and Infrastructure

Ansible AWS Azure CloudFormation DNS GCP Kubernetes Load Balancing Pulumi Terraform

Description

Evaluate and improve the organization’s disaster recovery posture, including RTO/RPO, dependency mapping, and failure domain analysis.
Define, document, and establish disaster recovery standards and best practices across cloud infrastructure, platforms, and application architectures.
Partner with SRE, platform, security, and product engineering teams to design resilient, fault-tolerant systems.
Lead the disaster recovery roadmap by balancing technical feasibility, cost, risk, and business priorities.
Design reference architectures for disaster recovery patterns such as pilot-light, warm standby, hot standby, and active-active.
Drive adoption of active-active disaster recovery for critical systems, including traffic management, data replication, consistency, and automated failover.
Define and operationalize DR testing strategies, including game days, chaos testing, and regular recovery exercises.
Establish documentation, runbooks, and escalation paths to ensure recoverability is clear and not dependent on individuals.
Evaluate and recommend platform upgrades, cloud services, and tooling that improve resilience, recovery speed, and reliability.
Serve as a technical advisor and mentor on disaster recovery and resilience for leadership and engineering teams.

Requirements

Bachelor’s or Master’s degree in Computer Science or equivalent practical experience.
8+ years of experience in cloud infrastructure, platform engineering, SRE, or reliability-focused architecture roles.
Deep understanding of disaster recovery concepts including RTO/RPO, blast radius reduction, failure domains, and dependency isolation.
Proven experience designing and implementing multi-region and multi-availability zone architectures.
Hands-on experience moving systems toward active-active or highly available architectures.
Strong grasp of data replication strategies, consistency tradeoffs, and recovery patterns for databases and stateful systems.
Extensive experience with major cloud providers, with AWS preferred and GCP/Azure acceptable.
Experience with Kubernetes-based platforms, including regional failover, workload portability, and cluster recovery strategies.
Experience designing and maintaining Infrastructure as Code using tools such as Terraform, Pulumi, CloudFormation, or Ansible.
Experience defining and running DR tests, game days, and failure simulations.

Benefits

Opportunity to have a large impact on patients’ access to affordable medications.
Work at a fast-growing healthcare technology company with a mission-driven product.
Join a highly collaborative, cross-functional team of builders and operators.
Equal opportunity employer committed to diversity and inclusion.
Optional SMS or MMS updates on application status, sent only if you give consent.

Similar Roles

Site Reliability Engineer

CSC Generation 251-1K Internet Software & Services

Backcountry is hiring a Site Reliability Engineer in Costa Rica to keep its ecommerce platform reliable, scalable, and observable across a multi-cloud environment.

Costa Rica Full-time Mid Level Site Reliability Engineer (SRE)

Ansible Argo CD AWS AWS CDK Bash CI/CD Docker GCP GitOps Grafana Helm Kubernetes Linux Node.js OpenSearch Prometheus Python Terraform TypeScript

5 hours, 29 minutes ago

Ssr Monitoring and Observability Analyst

Coderio 51-250 Internet Software & Services

Coderio is hiring an Observability & Monitoring Analyst to design and operate monitoring systems that improve availability, performance, and incident response across global clients’ IT environments.

Mexico Colombia Argentina Uruguay Contract Mid Level Site Reliability Engineer (SRE)

AWS Azure Bash Datadog DNS Docker ELK Stack Fluentd GCP Grafana Jaeger Kibana Kubernetes Linux Load Balancing Logstash New Relic OpenTelemetry Prometheus Python TCP/IP Zipkin

6 hours, 29 minutes ago

Senior Site Reliability Engineer

Counterpart Health 51-200 hospital & health care

Counterpart Health is hiring a Senior Site Reliability and Infrastructure Engineer to support and evolve the technology platform behind its primary care tool and maintain reliable infrastructure for domestic and international workloads.

United States Full-time Senior Site Reliability Engineer (SRE)

$160k-$208k

AWS Azure CI/CD Containerd DNS Docker GCP Go gRPC Helm Kubernetes Linux Load Balancing Prometheus Python Shell Scripting TCP/IP

2 days, 6 hours ago

Senior Test Platform & Reliability Engineer - Star Trek Fleet Command

Scopely 1K-5K Internet Software & Services

Scopely is hiring a Senior Test Platform & Reliability Engineer in Ireland to build validation, reliability, and developer enablement platforms for Star Trek Fleet Command’s large-scale live-service backend systems.

Ireland Full-time Senior SDET (Software Development Engineer in Test) Site Reliability Engineer (SRE)

AWS Bash CI/CD Docker GitLab Go Python Terraform

2 days, 6 hours ago
