Senior AI Engineer - Grafana Ops, AI/ML | USA | Remote

4 hours, 8 minutes ago
Full-time
Senior
Software Development
Grafana

Grafana

Grafana is the open observability platform providing analytics, monitoring, and visualization solutions with a focus on user control and cost efficiency.

IT Services
1K-5K
Founded 2014
$535M raised

Description

  • Build and deliver AI features that help users detect, triage, and resolve incidents using observability data and tools.
  • Prototype, test, and iterate quickly on LLM- and agent-powered workflows for incident lifecycle management and automated analysis.
  • Collaborate with data analysts, product managers, and designers to shape AI-driven product features.
  • Integrate agentic components with internal tools, alerting systems, runbooks, and developer workflows.
  • Use AI and automation tools to improve both product functionality and development workflows.
  • Own AI solutions end to end, ensuring they are scalable, maintainable, and aligned with real user workflows.
  • Validate ideas early with real users and evolve features based on feedback.
  • Contribute across teams in a highly dynamic, collaborative environment.
  • Support automation efforts that improve infrastructure and observability quality.
  • Expand AI agent capabilities across the observability stack to assist with incident response.

Requirements

  • Strong software engineering experience building production backend and/or full-stack systems.
  • Experience with LLMs, prompt engineering, and building GenAI-powered applications.
  • Proven track record of shipping production software that people actively use.
  • Experience working in cloud-native environments such as AWS, GCP, or Azure.
  • Experience using observability tools to understand and troubleshoot system behavior.
  • Ability to work with minimal supervision and solve complex engineering problems.
  • Comfort with rapid experimentation, shipping prototypes, and iterative product development.
  • Ability to communicate effectively with peers, product managers, and designers.
  • Bonus: Experience building or working with agent frameworks or multi-agent workflows.
  • Bonus: Experience with Kubernetes, Docker, Terraform, or similar infrastructure and deployment tooling.

Benefits

  • Base salary in Canada of CAD 164,490 to CAD 197,389.
  • Restricted Stock Units (RSUs) included with all roles.
  • 100% remote work in a global, remote-first culture.
  • Global annual leave policy of 30 days per year.
  • 3 days of annual leave reserved for Grafana Shutdown Days.
  • In-person onboarding for new team members.
  • Company-funded budget for AI coding assistants and access to frontier models.
  • Defined career growth pathways.
