Tekion

Tekion is a leading provider of cloud-native automotive platforms that unify DMS, CRM, Digital Retail, Analytics, and more. Their AI-powered software enables personalized selling, upsell, and cross-sell opportunities, driving revenue and profitability....

IT Services
1K-5K employees
Founded 2016
$435M raised

Description

  • Build and operate the LLM control plane and gateway, including smart routing, rate limits, quotas, failover, and token and cost tracking.
  • Ship a unified API and SDKs using REST and gRPC with normalized schemas, structured outputs, caching, and full observability.
  • Enforce safety and privacy controls such as content filtering, prompt and response validation, and PII redaction.
  • Enable multi-model, multi-vendor LLM usage with automated canarying and versioning.
  • Own the agent runtime, including the tool registry, permissions, function calling, grounding, and retrieval.
  • Design orchestration patterns for agents and manage agent state and long-running workflows.
  • Build platform components for classical ML training and scoring pipelines, experiment tracking, and model packaging.
  • Monitor model and data drift and retrain or tune models to maintain accuracy and relevance.
  • Add human-in-the-loop review and safe actioning before agents interact with dealer systems.
  • Evolve the domain graph, entity resolution, and reliable data ingestion pipelines to serve real-time context to agents.
  • Define and maintain SLOs for latency, uptime, and cost, and enable autoscaling and spend controls.
  • Maintain a model and agent registry with versioning, approvals, audit trails, reproducibility, and compliance support.
  • Provide templates, CLIs, sandboxes, and documentation to help product teams build and ship quickly, while mentoring engineers on MLOps and AI safety.

Requirements

  • 12–15+ years of experience building large-scale data, ML, or platform systems.
  • Strong software engineering fundamentals in API design, concurrency, and distributed systems.
  • Production experience with Python and at least one of Java, Scala, or Go.
  • Experience with microservices and API design.
  • Experience with MLOps at scale, including Airflow or Kubeflow, MLflow, CI/CD for models, A/B testing, shadow or canary releases, and online feature computation with Spark, Flink, or Kafka.
  • Experience with AWS is preferred, along with Docker and Kubernetes.
  • Practical ML knowledge including feature engineering, training, evaluation, and drift detection.
  • Experience deploying models that power user-facing workflows.
  • Experience building or operating an LLM gateway or control plane with provider adapters, routing policies, caching, quotas, rate limits, and cost and token accounting.
  • Experience with agentic systems, including tool use, function calling, orchestration frameworks, human-in-the-loop review, safety guardrails, and online evaluation or telemetry.
  • Experience with graph and retrieval technologies such as Neo4j, Neptune, TigerGraph, GraphQL, pgvector, Qdrant, or Milvus.
  • A platform-as-product mindset and ability to think in systems, with observability, fallback, and access control as core concerns.
  • Passion for AI and real-world LLM and agentic use cases.
  • A cost-aware approach that treats latency and dollars as first-class metrics.
  • Vendor-agnostic thinking and a focus on portability and resilience.
  • Strong documentation and teaching skills that help teams understand complex systems.
