Site Reliability Engineer I

2 weeks, 2 days ago
Full-time
Junior
DevOps and Infrastructure
Backblaze

Backblaze is a pioneer in robust, scalable, low-cost cloud backup and storage services, offering enterprise hot storage and low-cost backup and archive solutions. With the easiest way to back up all files, Backblaze provides unlimited, unthrottled, and unc...

IT Services
251-1K
Founded 2007

Description

  • Act as the first point of contact for customer-affecting issues and production alerts.
  • Drive resolution of technical problems and support timely incident handling.
  • Follow incident management processes and complete post-mortems to identify improvements.
  • Provide consistent communication to management during incidents and operational events.
  • Respond to Zabbix alerts, take direct action when needed, or escalate appropriately.
  • Ensure escalations are handed off successfully to the right owners.
  • Monitor pod health across sites and perform daily filesystem checks for pods.
  • Troubleshoot infrastructure and deployment issues for Data Center Technicians, including migration and Ansible playbook issues.
  • Identify and escalate potential network issues and support network-related deployment readiness.
  • Support Vault pre-deployment configuration, testing, migrations, and migration pod health checks.
  • Document operational procedures and help automate daily tasks.
  • Monitor server farm releases and updates, escalating issues as they arise.
  • Participate in on-call rotation and work outside normal business hours as needed.
  • Assist other TechOps team members and recommend process improvements to increase productivity.

Requirements

  • Must be located in Bangalore.
  • 2-4 years of relevant experience.
  • Working knowledge of Linux and system administration.
  • Knowledge of network cabling, network classification, and network topology.
  • Strong analytical thinking.
  • Strong communication skills and ability to work with different teams.
  • Desire to learn and develop necessary technical skills.
  • Ability to work outside normal business hours, including weekends, holidays, and evenings, as needed.

Benefits

  • RSU grants for full-time employees.
  • Annual company bonus plan.
  • Healthcare for family, including dental and vision coverage.
  • 401(k) retirement plan.
  • ESPP program.
  • Flexible vacation policy.
  • Maternity and paternity leave.
  • MacBook Pro for work plus a generous stipend to personalize your workstation.
  • Childcare bonus.
  • Fertility treatment and support.
  • Learning and development program.
  • Commuter benefits.
  • Culture that supports a healthy work-life balance.
  • Expected salary range of $66,000 - $88,000.


