Blink Health

Blink Health is a digital health company revolutionizing the prescription medication industry by providing affordable and accessible medications to millions of people across America. Their cloud-based pharmacy platform eliminates traditional roadblocks...

Health Care Providers & Services

Health Care

251-1K (360)

Founded 2014

$165M raised

7 open positions

Senior Site Reliability Engineer

1 month, 1 week ago

India

Full-time

Lead

Site Reliability Engineer (SRE)

DevOps and Infrastructure

Agile, Ansible, AWS, Azure, Bash, CloudFormation, DNS, GCP, Go, Helm, Kubernetes, Linux, Load Balancing, Microservices, Pulumi, Python, React, Secrets Management, TCP/IP, Terraform

Description

Establish and evolve SRE best practices across reliability, incident response, postmortems, error budgets, and operational readiness.
Define and drive the observability strategy, including SLIs/SLOs, alerting quality, dashboards, and service health indicators.
Design and implement software-driven infrastructure solutions that automate manual work and reduce operational toil.
Act as a technical leader and influence priorities across cloud infrastructure, reliability tooling, and platform architecture.
Own large, ambiguous initiatives from concept to delivery while aligning stakeholders across engineering, security, and product.
Improve platform resilience, scalability, performance, and compliance through infrastructure and security-focused engineering work.
Identify systemic risks and reliability gaps early and lead platform upgrades and architectural improvements.
Partner with engineering teams to improve developer workflows, tooling, and operational maturity.
Provide technical mentorship, architecture guidance, and high-quality design and code reviews.
Lead documentation and knowledge sharing so that systems and processes do not depend on any single owner.
Participate in and help mature incident response, escalation practices, and post-incident learning.

Requirements

Bachelor’s or Master’s degree in Computer Science or equivalent practical experience.
10+ years of experience in site reliability engineering, infrastructure engineering, or platform engineering roles with demonstrated impact at scale.
Expert-level troubleshooting across the full stack, from application to kernel to network.
Strong command-line proficiency and deep expertise in Linux systems and operating system fundamentals.
Advanced understanding of networking concepts including load balancing, proxies, DNS, TCP/IP, NAT, and service-to-service communication.
Experience with multiple languages such as Python, Go, and Bash, plus familiarity with troubleshooting application stacks such as React.
Strong track record of automating repetitive and complex operational work to reduce toil and increase reliability.
Ability to design and build internal tools in Python or Go that standardize and scale engineering practices.
Deep experience with cloud platforms, preferably AWS, with GCP or Azure also acceptable.
Strong expertise in Kubernetes and container orchestration, including EKS and Helm.
Proven experience designing and implementing observability systems, including metrics, logging, tracing, dashboards, and alerting.
Deep understanding of container technologies, security scanning, secrets management, dynamic configuration, and microservices architectures.
Familiarity with service meshes and advanced traffic management concepts.
Experience designing and maintaining company-wide infrastructure-as-code codebases using Terraform, Pulumi, CloudFormation, or Ansible.
Ability to think holistically about infrastructure design, cost, reliability, security, and long-term maintainability.
Comfort operating in an agile environment with disciplined testing and quality practices.

Benefits

Opportunity to work on products that improve prescription access and affordability for millions of patients.
High-impact role at a fast-growing healthcare technology company.
Collaborative, cross-functional team environment.
Equal opportunity employer committed to diversity.
SMS or MMS application status updates for applicants who opt in.

Similar Roles

Site Reliability Engineer

CSC Generation 251-1K Internet Software & Services

Backcountry is hiring a Site Reliability Engineer in Costa Rica to keep its ecommerce platform reliable, scalable, and observable across a multi-cloud environment.

Costa Rica Full-time Mid Level Site Reliability Engineer (SRE)

Ansible, Argo CD, AWS, AWS CDK, Bash, CI/CD, Docker, GCP, GitOps, Grafana, Helm, Kubernetes, Linux, Node.js, OpenSearch, Prometheus, Python, Terraform, TypeScript

5 hours, 28 minutes ago

Semi-Senior Monitoring and Observability Analyst

Coderio 51-250 Internet Software & Services

Coderio is hiring an Observability & Monitoring Analyst to design and operate monitoring systems that improve availability, performance, and incident response across global clients’ IT environments.

Mexico Colombia Argentina Uruguay Contract Mid Level Site Reliability Engineer (SRE)

AWS, Azure, Bash, Datadog, DNS, Docker, ELK Stack, Fluentd, GCP, Grafana, Jaeger, Kibana, Kubernetes, Linux, Load Balancing, Logstash, New Relic, OpenTelemetry, Prometheus, Python, TCP/IP, Zipkin

6 hours, 28 minutes ago

Senior Site Reliability Engineer

Counterpart Health 51-200 hospital & health care

Counterpart Health is hiring a Senior Site Reliability and Infrastructure Engineer to support and evolve the technology platform behind its primary care tool and maintain reliable infrastructure for domestic and international workloads.

United States Full-time Senior Site Reliability Engineer (SRE)

$160k-$208k

AWS, Azure, CI/CD, Containerd, DNS, Docker, GCP, Go, gRPC, Helm, Kubernetes, Linux, Load Balancing, Prometheus, Python, Shell Scripting, TCP/IP

2 days, 6 hours ago

Senior Test Platform & Reliability Engineer - Star Trek Fleet Command

Scopely 1K-5K Internet Software & Services

Scopely is hiring a Senior Test Platform & Reliability Engineer in Ireland to build validation, reliability, and developer enablement platforms for Star Trek Fleet Command’s large-scale live-service backend systems.

Ireland Full-time Senior SDET (Software Development Engineer in Test) Site Reliability Engineer (SRE)

AWS, Bash, CI/CD, Docker, GitLab, Go, Python, Terraform

2 days, 6 hours ago
