Senior AI Inference Engineer

1 week ago
Senior
DevOps and Infrastructure
Media.Monks

Media.Monks is a disruptive global powerhouse connecting content, data, and digital media services to lead brands into a new digital era.

Industry: Media
Company size: 5K-10K
Founded 2001

Description

  • Architect, implement, and optimize end-to-end AI inference services and agentic pipelines in Python.
  • Design autonomous agents that can interpret, reason about, and act on video and multi-modal content.
  • Integrate Vision Language Models into robust, production-grade workflows.
  • Use LLM and agent orchestration frameworks to coordinate complex visual AI tasks.
  • Deploy and operate services on Kubernetes and related infrastructure, ensuring reliability and scalability under heavy media workloads.
  • Architect distributed systems on AWS, balancing trade-offs across performance, cost, and resilience.
  • Optimize workloads for modern NVIDIA GPU architectures for real-time and high-throughput media use cases.
  • Collaborate with clients on pre-sales discussions to validate feasibility, shape solutions, and clarify requirements.
  • Create clear architecture diagrams and technical documentation for technical and non-technical stakeholders.
  • Provide technical leadership to project teams to ensure implementation aligns with the intended architecture and product value.

Requirements

  • Significant senior-level experience building and shipping AI/ML systems in production.
  • Strong Python experience and familiarity with a modern data/ML stack.
  • Proven experience taking models from notebooks or prototypes into low-latency inference services.
  • Hands-on experience building agentic systems, especially with computer vision or multi-modal inputs.
  • Experience architecting autonomous agents that can analyze video content.
  • Experience integrating Vision Language Models such as GPT-4o, Gemini Pro Vision, or LLaVA.
  • Familiarity with LLM/agent orchestration frameworks such as LangGraph, AutoGen, or Semantic Kernel.
  • Strong practical experience with Kubernetes in production.
  • Experience architecting distributed systems on AWS beyond basic instance provisioning.
  • Understanding of modern NVIDIA GPU architectures such as Ampere, Hopper, or Blackwell.
  • Product-minded and able to align technical decisions with business outcomes and ROI.
  • Excellent communication skills and comfort in client-facing and pre-sales conversations.
  • Self-starter who thrives in ambiguity and enjoys reading source code.
  • Nice to have: experience with FFmpeg, GStreamer, NVENC/NVDEC, modern codecs, OpenShift, NVIDIA Holoscan, or Mojo, or with deploying AI systems in edge or hybrid/on-prem environments.
  • Remote location within North or South America.

Benefits

  • Remote within North or South America.
  • Opportunity to work on cutting-edge AI systems for major organizations in media, entertainment, gaming, and sport.
  • Exposure to client-facing work, including pre-sales and solution shaping.
  • Work on challenging real-time AI and video inference problems with modern GPU and cloud infrastructure.
  • Inclusive, equity-focused hiring and work environment.
  • Equal-opportunity employer committed to diversity and inclusion.
