Tekion

Tekion is a leading provider of cloud-native automotive platforms that unify DMS, CRM, Digital Retail, Analytics, and more. Their AI-powered software enables personalized selling, upsell, and cross-sell opportunities, driving revenue and profitability....

IT Services
1K-5K
Founded 2016
$435M raised

Description

  • Build and operate the LLM control plane and gateway, including smart routing, rate limits, failover, and token/cost tracking.
  • Ship unified APIs and SDKs with normalized schemas, structured outputs, caching, and full observability across traces, logs, and metrics.
  • Enforce safety and privacy controls such as content filtering, prompt and response validation, and PII redaction.
  • Enable multi-model and multi-vendor LLM usage with automated canarying and versioning.
  • Own the agent runtime, including tool registry, permissions, function calling, grounding, and retrieval.
  • Design agent orchestration patterns and manage agent state and long-running workflows.
  • Build platform components for classical ML training and scoring pipelines, experiment tracking, and model packaging.
  • Monitor model and data drift, and retrain or tune models to maintain accuracy and relevance.
  • Add human-in-the-loop review and safe actioning before agents interact with dealer systems.
  • Evolve the domain graph, entity resolution, and data ingestion pipelines to serve real-time context with access controls and lineage.
  • Implement hybrid retrieval using graph, vector, and keyword search with smart caching to balance accuracy, latency, and cost.
  • Define and manage SLOs for latency, uptime, and cost, while enabling autoscaling and spend controls.
  • Maintain model and agent registries with versioning, approvals, audit trails, reproducibility, and compliance support.
  • Provide templates, CLIs, sandboxes, and documentation to help product teams ship quickly, and mentor engineers on MLOps and AI safety best practices.

Requirements

  • 5+ years building large-scale data, ML, or platform systems.
  • Strong software engineering fundamentals in API design, concurrency, and distributed systems.
  • Production experience with Python and one of Java, Scala, or Go.
  • Experience with microservices and API design.
  • Experience with MLOps at scale, including Airflow or Kubeflow, MLflow, CI/CD for models, A/B testing, shadow or canary deployments, and online feature computation with Spark, Flink, or Kafka.
  • Experience with cloud and containers, especially AWS, Docker, and Kubernetes.
  • Practical ML knowledge covering feature engineering, training, evaluation, and drift detection.
  • Experience deploying models that power user-facing workflows.
  • Experience building or operating an LLM gateway or control plane, including provider adapters, routing policies, caching, quotas, rate limits, and cost/token accounting.
  • Experience with agentic systems, including tool use or function calling, orchestration frameworks, human-in-the-loop workflows, safety guardrails, and online evaluation or telemetry.
  • Experience with graph and retrieval systems, such as Neo4j, Neptune, TigerGraph, GraphQL, pgvector, Qdrant, or Milvus.
  • Preferred mindset includes platform-as-product thinking, strong observability and access-control habits, cost awareness, vendor-agnostic design, and the ability to document and teach complex systems.
