Staff AI Engineer - MCP Services

12 hours, 12 minutes ago
Full-time
Lead
Software Development
Datadog

Datadog is the go-to monitoring platform for cloud applications, providing a complete suite of services to ensure optimal performance and user satisfaction.

IT Services
5K-10K employees
Founded 2010

Description

  • Lead improvements to Datadog’s public-facing MCP server so intelligent agents can discover and interact with Datadog services.
  • Design and implement agentic tool surfaces for both evaluation and production use across multiple AI agents.
  • Build and maintain evaluation pipelines to measure agent performance on Datadog workflows such as investigations, incident triage, and metric queries.
  • Investigate and resolve failure cases by analyzing tool output, improving query parsing, and strengthening agent feedback mechanisms.
  • Collaborate with Applied AI and internal teams to define shared standards for tool integration and data access.

Requirements

  • Staff-level engineering experience with a strong background in applied AI, agentic programming, or related LLM automation work.
  • Experience with LLM orchestration frameworks such as LangChain, LangGraph, or CrewAI, or with agent orchestration and tool-use systems.
  • Experience building evaluation frameworks for LLM agents or AI systems, including metrics design and data instrumentation.
  • Comfort working in high-ambiguity, fast-changing environments and independently defining and prioritizing direction.
  • Strong systems thinking and the ability to reason across multiple agents, tools, and user scenarios.
  • Demonstrated ability to use AI coding tools in day-to-day workflows and to validate, critique, and refine AI-generated output.
  • Bonus: Experience with the MCP standard or with building agent-compatible tooling surfaces.
  • Bonus: Familiarity with building and evaluating ReAct agentic loops.
  • Passion for advancing agent-augmented software and AI-enabled products.

Benefits

  • Generous and competitive benefits package.
  • New hire stock equity (RSUs) and an employee stock purchase plan.
  • Continuous career development and pathing opportunities.
  • Employee-focused, best-in-class onboarding.
  • Internal mentor and cross-departmental buddy program.
  • Hybrid work environment with a focus on work-life harmony.
  • Friendly and inclusive workplace culture.
