RUNWARE

RUNWARE provides an affordable API that enables AI developers to efficiently run image, video, and custom generative AI models without the need for extensive infrastructure or machine learning expertise.

Industry: Internet Software & Services
Company size: 1-10 employees
Founded: 2023

Description

  • Build and scale infrastructure for real-time AI inference across GPU fleets, bare-metal servers, and containerized production systems.
  • Help evolve the platform toward more elastic, on-demand infrastructure that can respond to customer traffic and model demand.
  • Improve the performance, reliability, and resilience of request entrypoints, inference services, queues, storage, load balancers, and networking.
  • Automate infrastructure operations, including provisioning, configuration, CI/CD, deployment safety, progressive rollouts, and rapid rollback.
  • Build and maintain the observability stack needed to detect issues early, understand capacity, and resolve problems before they affect customers.
  • Lead production operations, incident response, debugging, and post-incident improvements.
  • Strengthen infrastructure security and compliance through patching, secrets management, access controls, hardening, auditability, and documentation.

Requirements

  • Strong experience as a DevOps Engineer, SRE, Infrastructure Engineer, Platform Engineer, or in a similar role running production systems at scale.
  • Deep Linux knowledge and confidence debugging real production issues across networking, storage, performance, services, and system behavior.
  • Hands-on experience building automation, Infrastructure-as-Code, CI/CD pipelines, and deployment workflows.
  • Experience operating high-availability, low-latency, or high-throughput platforms where reliability and performance directly affect customers.
  • Strong networking fundamentals across TCP/IP, DNS, load balancing, routing, firewalls, proxies, TLS, and HTTP.
  • A calm and pragmatic approach under pressure, with strong communication, good judgment, and a bias toward automation over manual toil.
  • Experience operating GPU infrastructure for AI/ML inference, including NVIDIA drivers, CUDA, container runtimes, GPU monitoring, capacity planning, and workload isolation (bonus).
  • Familiarity with inference serving and optimization frameworks such as vLLM, TensorRT, Triton, or similar (bonus).

Benefits

  • Remote-first work environment, with the option to work from home anywhere the company can employ you.
  • Flexible hours outside core collaboration blocks.
  • Generous paid time off, including vacation, sick days, and public holidays.
  • Meaningful stock options.
  • Paid family leave, including maternity, paternity, and caregiver time.
  • Twice-yearly company retreats in inspiring locations.
  • Built-in downtime after major release cycles to unplug and recharge.

Interested in this position?

Apply directly on the company website
