Block

Block is a company comprising Square, Cash App, Spiral, TIDAL, TBD, and foundational teams. It focuses on economic empowerment, building tools that expand access to the economy. Square helps sellers run and grow businesses, Cash App redefin...

Capital Markets
10K-50K
Founded 2009

Description

  • Build and extend platforms that improve system reliability.
  • Work on company-wide reliability goals across Block’s critical infrastructure.
  • Standardize reliability tools across multiple platforms and organizations.
  • Triage, coordinate, and lead stabilization of SEV 0–1 incidents.
  • Serve as primary on-call for Tier 0 services and maintain structured escalation paths.
  • Lead incident command, coordinate mitigation, and drive escalation during high-severity events.
  • Drive platform-wide reliability improvements, shared operational tooling, and deploy-safety patterns.
  • Use AI-driven systems to improve signal detection, reduce noise, and accelerate root cause analysis.
  • Design and implement safe deployment patterns, including progressive delivery, automated rollback, and guardrails.

Requirements

  • 5+ years of software development experience.
  • Experience running production on-call for high-availability systems.
  • Strong incident management skills, including structured triage, mitigation under pressure, and blameless postmortems.
  • Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation.
  • Monitoring and observability expertise, including building and tuning alerts for uptime, error rates, latency regression, and resource exhaustion.
  • Familiarity with AI-driven tooling for observability, incident analysis, or automation.
  • Ability to create and maintain evidence-based maturity assessments using trailing 90-day data windows.
  • Comfort with vendor and dependency management, including maintaining validated escalation contacts reachable within 5 minutes.
  • Demonstrated technical initiative and leadership on previous backend or platform-focused projects.
  • Experience with Kotlin, modern Java (11+), HTTP, JSON, gRPC, Protocol Buffers, MySQL, Vitess, DynamoDB, event-driven architectures, Datadog, LaunchDarkly, Terraform, Kubernetes, Istio/Envoy, or AWS is preferred.

Benefits

  • Healthcare coverage, including medical, vision, and dental insurance.
  • Health Savings Account and Flexible Spending Account options.
  • Retirement plans with company match.
  • Employee Stock Purchase Program and equity plan eligibility.
  • Wellness programs, including mental health access, 1:1 financial planners, and a monthly wellness allowance.
  • Paid parental and caregiving leave.
  • Paid time off, including 12 paid holidays, plus paid sick leave or Flexible Time Off depending on employment category.
  • Learning and development resources.
  • Paid life insurance, AD&D, and disability benefits.
  • Sign-on bonus eligibility, depending on the role and applicable plans.
