Grafana

Grafana is the open observability platform providing analytics, monitoring, and visualization solutions with a focus on user control and cost efficiency.

Industry: IT Services
Company size: 1K-5K
Founded: 2014
Funding: $535M raised

Description

  • Own end-to-end development of multi-agent AI systems, from architecture and implementation through testing, deployment, and ongoing operation.
  • Build modular, composable agentic systems using orchestration frameworks such as LangChain, CrewAI, Anthropic MCP, or similar tools.
  • Develop reusable agentic skills that can be used across Slack, dashboards, internal apps, and CLI interfaces.
  • Implement observability and feedback loops for AI systems, including logging, performance metrics, prompt iteration, model evaluation, and cost management.
  • Establish governance and compliance standards for AI workflows, including access controls, audit trails, PII handling, and human-in-the-loop escalation paths.
  • Build MCP servers, APIs, CLIs, and microservices that connect AI models to business systems such as BigQuery, Slack, CRMs, email, calendars, and analytics tools.
  • Architect retrieval-augmented generation data flows that connect LLMs to internal knowledge bases, customer data, and real-time business context.
  • Build serverless or containerized services on GCP, such as Cloud Functions and Cloud Run, that scale with usage.
  • Partner with RevOps, Demand Generation, Regional Marketing, Data Engineering, GTM Systems, Field Operations, and SDR teams to scope automation opportunities and deliver measurable outcomes.
  • Design and deploy self-service workflows with documentation, playbooks, enablement materials, CI/CD, testing, and production reliability standards.

Requirements

  • 8+ years of software engineering experience with depth in backend development, systems integration, or data/analytics engineering.
  • 2+ years of hands-on experience applying LLMs or AI to production workflows.
  • Strong proficiency in Python and JavaScript/Node.js, with Git-based workflows, code review practices, and testing discipline.
  • Hands-on experience with LLM patterns including prompt engineering, RAG, function calling/tool use, structured output parsing, and evaluation.
  • Experience building and operating multi-agent systems at scale, including agent decomposition, orchestration patterns, state management, and production monitoring.
  • Deep familiarity with Google Cloud Platform, BigQuery, and serverless/containerized services such as Cloud Functions and Cloud Run.
  • Understanding of LLM failure modes and production mitigations, including confidence thresholds, fallback logic, human escalation, and cost/latency management.
  • Proven ability to identify high-leverage problems, push back on low-impact requests, and deliver end-to-end with minimal direction.
  • Fluency with AI-assisted development tools such as GitHub Copilot, Cursor, Claude Code, Gemini CLI, or OpenAI Codex.
  • Clear technical communication skills, with the ability to explain complex systems to both engineers and business stakeholders.
  • Experience with vector databases or retrieval pipelines such as Pinecone, Weaviate, ChromaDB, Qdrant, or pgvector (bonus).
  • Familiarity with marketing or sales platforms such as Salesforce, Customer.io, HubSpot, Marketo, or Outreach (bonus).
  • Experience with frontend frameworks such as React or Slack Block Kit for building user-facing AI tool interfaces (bonus).
  • Experience with observability tooling for AI systems such as LangSmith, Weights & Biases, or custom evaluation frameworks (bonus).
  • Experience with workflow orchestration platforms such as n8n, Temporal, Prefect, or Airflow (bonus).
  • Familiarity with Model Context Protocol (MCP) or similar standards for connecting AI systems to data sources (bonus).
  • Prior work automating marketing, sales, or customer success workflows in a B2B SaaS environment (bonus).
  • Active participation in open-source communities is preferred.

Benefits

  • Base compensation range in Canada of CAD 164,490 to CAD 197,389.
  • All roles include Restricted Stock Units (RSUs).
  • 100% remote work with a global team across 40+ countries.
  • 30 days of annual leave per year, including 3 company-wide shutdown days.
  • In-person onboarding to help new hires ramp up effectively.
  • Career growth pathways and opportunities to develop within the company.
