Veeam Software

Veeam Software is a global leader in backup and modern data protection, offering solutions for virtual environments, enterprises, small businesses, and service providers worldwide.

Internet Software & Services
1K-5K employees
Founded 2006
$500M raised

Description

  • Get up to speed on Veeam Data Cloud workloads, dependencies, and operational workflows by reading code, documentation, and working with subject matter experts.
  • Write and maintain runbooks, incident guides, onboarding materials, and other operational documentation.
  • Participate in incident response, including triage, investigation, mitigation, and postmortems.
  • Help implement and maintain service level indicators, service level objectives, and error budgets.
  • Identify reliability issues and propose concrete improvements during incidents and reviews.
  • Support high availability and fault tolerance work on Azure, including Azure Government.
  • Implement monitoring improvements by adding instrumentation, alerting, and dashboards.
  • Contribute to toil reduction through automation and tooling improvements.
  • Participate in on-call rotations.
  • Work with engineering, security, compliance, and operations teams to deliver reliability improvements.

Requirements

  • 3+ years of experience in Software Engineering, including at least 1 year in SRE, Platform Engineering, or DevOps for cloud-hosted services.
  • Experience with cloud infrastructure on Azure or a comparable cloud provider.
  • Experience working in regulated or compliance-oriented environments such as government, financial, or healthcare.
  • Ability to read and understand code well enough to investigate system behavior independently.
  • Experience with monitoring and observability tools such as Prometheus, Grafana, OpenTelemetry, or the ELK stack.
  • Experience with IaC tools such as Terraform, Terragrunt, or Pulumi, and with Kubernetes.
  • Experience with CI/CD tools such as GitHub Actions, Azure DevOps, GitLab CI, or ArgoCD.
  • Strong programming skills in at least one of TypeScript/JavaScript, Go, Java, C#, or a similar language.
  • Solid understanding of distributed systems fundamentals and networking basics.
  • Clear written and verbal communication skills.
  • Preferred: experience in Government or Sovereign Cloud environments such as Azure Government or AWS GovCloud.
  • Preferred: background in SaaS platforms or multi-tenant systems.
  • Preferred: familiarity with chaos engineering, resilience testing, or load testing.
  • Preferred: exposure to building or improving reliability practices on a team.
  • Preferred: familiarity with AI-first development workflows using LLM-powered tools for automation, code generation, or documentation.

Benefits

  • Unlimited paid time off, 12 paid holidays, 4 global VeeaMe Days for self-care, and 24 paid volunteer hours annually.
  • Paid parental leave: 8 weeks for all parents and 16 weeks for birthing parents.
  • Medical, dental, and vision coverage starting on the first day.
  • Mental health support, therapy sessions, and digital wellness tools through the Employee Assistance Program.
  • 401(k) retirement plan with company matching contributions.
  • Fertility, adoption, and surrogacy support through Maven.
  • Legal services, identity protection, and supplemental health insurance options.
  • Tax-advantaged spending accounts for healthcare, dependent care, and commuting.
  • Professional development resources including mentorship, training, workshops, on-demand learning libraries, and learning events.
  • Competitive compensation with pay transparency, performance-based bonus, and role-based geographic salary ranges.
