SWORD Health

SWORD Health provides AI-powered digital physical therapy designed to prevent pain, support recovery, and improve overall health. The company aims to transform the rehabilitation industry by combining technology with clinical oversight.

Health Care Providers & Services
251-1K employees
Founded 2015
$324M raised

Description

  • Design, build, and maintain inference infrastructure for AI products with high throughput, low latency, and cost efficiency.
  • Own the end-to-end deployment pipeline for AI models, from computer vision to large language models.
  • Architect and scale Kubernetes clusters for GPU-accelerated workloads, including autoscaling and resource scheduling.
  • Build and operate infrastructure for real-time AI agents, including WebRTC cluster provisioning and low-latency speech services.
  • Drive inference scaling strategies such as speculative decoding, continuous batching, and model parallelism.
  • Develop and maintain Infrastructure as Code and GitOps workflows for GPU-enabled environments.
  • Instrument and monitor inference systems for GPU utilization, model latency, throughput, and error rates.
  • Collaborate with ML Engineers, Data Scientists, and Product teams to turn model requirements into production-ready infrastructure.
  • Evaluate emerging AI infrastructure tools, frameworks, and hardware to improve performance and efficiency.
  • Mentor team members on AI infrastructure best practices and production ML systems.

Requirements

  • 5+ years of infrastructure engineering experience, including at least 2 years focused on AI/ML workloads in production.
  • Strong Kubernetes experience for GPU-accelerated workloads, including scheduling, resource management, and autoscaling.
  • Hands-on experience with model serving and inference optimization for computer vision and large language model workloads.
  • Solid understanding of LLM inference optimization techniques, including speculative decoding, batching, quantization, and model parallelism.
  • Experience provisioning and managing infrastructure for real-time AI systems, including WebRTC clusters and AI agent architectures.
  • Familiarity with real-time video/computer vision inference pipelines and low-latency continuous data processing.
  • Familiarity with speech-to-text and text-to-speech serving infrastructure for low-latency voice AI.
  • Experience with Infrastructure as Code tools such as Terraform and with GitOps methodologies.
  • Working knowledge of GPU infrastructure, including the NVIDIA CUDA ecosystem, multi-GPU setups, and GPU monitoring/profiling.
  • Strong Linux systems and networking fundamentals for latency-sensitive workloads.
  • Fluent in English, both written and oral.
  • Proactive, ownership-driven mindset with the ability to identify and resolve inference bottlenecks early.
  • Experience with LLM serving engines such as vLLM, SGLang, or llm-d (preferred).
  • Experience with NVIDIA Triton Inference Server and TensorRT for real-time computer vision workloads (preferred).
  • Familiarity with NVIDIA Riva or similar STT/TTS platforms (preferred).
  • Experience with Istio or similar service mesh tools (preferred).
  • Experience with Kafka for event streaming (preferred).
  • Experience with Prometheus, Alertmanager, and Grafana for observability (preferred).
  • Experience with Elasticsearch, Logstash, and Kibana for log management (preferred).
  • Experience with Vault for secrets management (preferred).
  • Experience with Redis, MySQL, and DNS management (preferred).
  • Experience provisioning infrastructure on AWS, Azure, or GCP (preferred).
  • Good knowledge of cloud networking, including VPCs, routing, NAT, and troubleshooting with tools like tcpdump (preferred).
  • Experience with WebRTC infrastructure and real-time media streaming (preferred).
  • Experience with Python, Go, or similar languages used in ML infrastructure tooling (preferred).
  • Familiarity with Scrum methodology (preferred).

Benefits

  • €66,500 - €104,500 annual salary range, including base, variable, and equity.
  • Competitive compensation with potential bonus and stock option value.
  • Flexible remote or hybrid work policy.
  • Unlimited vacation and control over your own working hours when remote.
  • Health and well-being program, including digital therapist sessions.
  • Career development and growth opportunities.
  • Opportunity to work with a talented team on an innovative healthcare solution.
  • Fast-paced, stimulating environment with room for creativity.
