Senior Site Reliability Engineer (DevTools)

25 minutes ago
Full-time
Senior
DevOps and Infrastructure
Nebius

Nebius enables B2B companies to build local hyperscale cloud platforms with cost-effective GPUs, InfiniBand networking, and 50% lower compute cost. It offers managed Kubernetes and a launch-ready business model for innovative cloud solutions.

Internet Software & Services
51-250

Description

  • Improve services based on user feedback and user problems.
  • Build fault-tolerant, self-healing architecture.
  • Identify ways to speed up systems and reduce user friction.
  • Modify and extend closed-source and open-source solutions, including GitLab and TeamCity plugins.
  • Support users and help resolve their requests and issues.
  • Define metrics that measure user problems and verify that fixes actually resolve them.
  • Work with large-scale build, artifact, and monorepo systems in a production environment.

Requirements

  • Experience combining SRE and software engineering work in roughly a 50/50 split.
  • Experience with Java, Kotlin, Go, Python, and/or Ruby.
  • Understanding of the internals of Unix-like systems and the JVM.
  • Strong focus on improving user experience.
  • Ability to adapt quickly in a fast-changing environment.
  • Experience in Platform Engineering is a plus.
  • Experience operating GitLab or another version control system is a plus.
  • Experience operating TeamCity or another CI system is a plus.
  • Experience with Spring and operating Java monoliths is a plus.
  • Willingness to take part in a coding interview, which is part of the hiring process.
  • Must be authorized to work in the country of application and provide proof of employment eligibility.

Benefits

  • Competitive compensation.
  • Career growth and learning opportunities.
  • Flexibility and work-life balance.
  • Collaborative and innovative culture.
  • Opportunity to work on impactful AI projects.
  • International environment with talented teams.
