NetSRE

3 days, 4 hours ago
Mid Level
Software Development
Nebius

Nebius enables B2B companies to build local hyperscale cloud platforms with cost-effective GPUs, InfiniBand networking, and 50% lower compute cost. It offers managed Kubernetes and a launch-ready business model for innovative cloud solutions.

Internet Software & Services
51-250

Description

  • Ensure fault tolerance, scalability, and uninterrupted operation of infrastructure services.
  • Use modern technologies to solve infrastructure and operational problems.
  • Implement and improve CI/CD processes.
  • Support systems used for functional and load testing.
  • Monitor engineering equipment in data centers, including power supply, air cooling, and water cooling systems.
  • Monitor IT equipment such as racks, servers, JBODs, JBOGs, power shelves, and network devices.
  • Track assets and hardware repair tasks.
  • Support server production activities.

Requirements

  • Proficiency in Linux systems.
  • Strong Python and Bash scripting skills for automation.
  • Demonstrated ability to troubleshoot complex hardware, software, and networking issues.
  • Strong analytical and problem-solving skills focused on system performance optimization.
  • Working proficiency in English.
  • Experience designing, developing, and running high-load distributed systems is a plus.
  • Interest in backend development is a plus.
  • Applicants must be authorized to work in the country where they apply and provide proof of employment eligibility.

Benefits

  • Competitive compensation.
  • Career growth and learning opportunities.
  • Flexibility and work-life balance.
  • Collaborative and innovative culture.
  • Opportunity to work on impactful AI projects.
  • International environment with talented teams.
