Staff Software Engineer - Serverless

1 month, 1 week ago
Full-time
Lead
Software Development
RUNWARE

RUNWARE provides an affordable API that enables AI developers to efficiently run image, video, and custom generative AI models without the need for extensive infrastructure or machine learning expertise.

Internet Software & Services
1-10
Founded 2023

Description

  • Build the core systems behind Runware’s serverless platform, including workload execution, routing, scheduling, isolation, and scaling.
  • Design a simple SDK-based experience that lets developers deploy models and run AI workloads without managing infrastructure.
  • Design and improve the control plane for serverless execution, including APIs, workers, lifecycle management, retries, and failure handling.
  • Work with infrastructure and ML teams to improve startup time, GPU utilization, model warm-up, caching, and placement.
  • Build observability tools that make serverless workloads easy to monitor, debug, and operate globally at scale.
  • Lead technical design for a new product area and help define engineering standards.
  • Mentor other engineers and provide technical leadership across the team.
  • Collaborate closely with senior leadership to bring the new product line to market.

Requirements

  • Strong experience as a Staff Engineer, Senior Software Engineer, Backend Engineer, Platform Engineer, or similar.
  • Experience building backend services, distributed systems, developer platforms, or workload orchestration systems.
  • Strong understanding of async processing, queues, scheduling, retries, back pressure, and failure handling.
  • Comfort working across APIs, control planes, workers, databases, and observability systems.
  • Strong engineering fundamentals in one or more backend languages such as Python, Go, or similar.
  • Good judgment around trade-offs between reliability, latency, scale, cost, and developer experience.
  • Clear communication, strong ownership, and the ability to lead technical direction in a fast-moving environment.
  • Experience building serverless platforms, job execution systems, container platforms, or compute orchestration systems is a nice-to-have.
  • Experience with GPU-backed workloads, AI/ML inference, model serving, batch processing, or high-performance compute is a nice-to-have.
  • Familiarity with technologies such as vLLM, TensorRT, Triton, Kubernetes, Nomad, or Knative is a nice-to-have.
  • Experience improving workload performance through batching, autoscaling, model warm-up, caching, request routing, or queue management is a nice-to-have.
  • Experience with multi-tenant isolation, sandboxing, quotas, rate limits, resource accounting, or usage-based billing is a nice-to-have.

Benefits

  • Remote-first setup with the ability to work from home anywhere the company can employ you.
  • Flexible hours with core collaboration blocks and control over your schedule outside them.
  • Generous paid time off, including vacation, sick days, and public holidays.
  • Meaningful stock options that let you share in the upside you help create.
  • Paid family leave for maternity, paternity, and caregiver time.
  • Company retreats twice a year in inspiring locations.
  • Built-in downtime after major release pushes to rest and recharge.

