L3 Hardware Support Lead

2 hours, 16 minutes ago
Full-time
Lead
Customer and Technical Support
Nebius

Nebius enables B2B companies to build local hyperscale cloud platforms with cost-effective GPUs, InfiniBand networking, and 50% lower compute costs. It offers managed Kubernetes and a launch-ready business model for innovative cloud solutions.

Internet Software & Services
51-250 employees

Description

  • Build and lead the L3 and escalation support function for datacenter server infrastructure across multiple regions.
  • Serve as Incident Commander for high-severity production incidents, coordinating mitigation and communication.
  • Own incident response, problem management, and cross-team escalation workflows end to end.
  • Support enterprise bare metal customers under contractual SLAs, including executive-level stakeholder communication.
  • Drive root cause analysis for hardware, firmware, and platform-level failures and define corrective actions.
  • Manage vendor escalations with ODMs and OEMs through formal support channels and direct engagement.
  • Partner with datacenter operations, hardware engineering, and infrastructure teams to improve reliability at fleet scale.
  • Establish KPIs, escalation standards, and operational playbooks for production hardware support.
  • Hire, coach, and scale a high-performing support engineering team.
  • Continuously improve response times, incident quality, and customer experience.

Requirements

  • Experience building or leading an L3 and escalation support function for datacenter server infrastructure in distributed, multi-region environments.
  • Experience supporting enterprise bare metal customers under contractual SLAs.
  • Strong incident management leadership experience, including serving as Incident Commander.
  • Proven ability to build and formalize incident response, problem management, and cross-team escalation processes from scratch.
  • People management experience, including hiring, coaching, and performance management.
  • Strong English communication skills, written and verbal.
  • Deep troubleshooting capability across Linux, server hardware, and firmware (BIOS/BMC), with the ability to guide investigations at a systems-engineer level (preferred).
  • Strong familiarity with GPU server platforms and common diagnostics such as nvidia-smi, dcgmi, and Linux log correlation (preferred).
  • Experience driving ODM and OEM vendor escalations through support portals and direct channels (preferred).
  • Scripting skills in bash and basic Python for troubleshooting and lightweight analytics (preferred).
  • Exposure to OCP-based hardware platforms (preferred).
  • Remote work within the United States; full-time position.

Benefits

  • Health insurance.
  • 401(k) plan.
  • Paid time off.
  • Sick leave.
  • Competitive salary of $125k–$180k base plus quarterly performance bonuses.
  • Comprehensive benefits package.
  • Flexible working arrangements.
  • Opportunities for professional growth within Nebius.
