Senior Manager, Engineering - Observability Platform (Remote Eligible)

3 hours, 36 minutes ago
Full-time
Lead
Software Development
Smartsheet

Smartsheet provides an enterprise work management platform that enables teams to efficiently manage projects, automate processes, and enhance collaboration through a user-friendly interface that combines spreadsheet functionality with advanced workflow...

Internet Software & Services
1K-5K
Founded 2005

Description

  • Lead a team of engineers building a unified observability platform used across Smartsheet.
  • Own and evolve the observability platform roadmap, consolidating multiple tooling platforms into a scalable capability.
  • Define platform standards, contribute to architectural direction, and establish strong operational practices.
  • Hire and scale the team, including recruiting senior engineers and supporting distributed global collaboration.
  • Design and deliver centralized observability infrastructure for metrics, distributed tracing, alerting, and log analytics.
  • Drive SLO and SLA definition, tooling, and reliability visibility in partnership with infrastructure and on-call teams.
  • Own instrumentation governance, cost optimization, and rollout of observability capabilities such as APM, RUM, and dashboards.
  • Partner with the AI Platform team to build and maintain AI/ML observability integrations.
  • Develop dashboards and alerting for agentic AI workloads, including latency, token usage, error rates, and evaluation drift.
  • Lead cross-functional observability initiatives, manage delivery risk, and report platform status and progress to leadership.

Requirements

  • 10+ years of software or platform engineering experience with strong fundamentals in distributed systems, infrastructure, and backend services.
  • 3 years of engineering management experience, including team building, performance management, and cross-functional delivery ownership.
  • Deep hands-on expertise with Datadog (APM, metrics, logs, alerting), OpenSearch or Elasticsearch, distributed tracing (OpenTelemetry or equivalent), and SLO/SLA management at scale.
  • Experience operating observability platforms for high-availability, high-throughput production environments.
  • Experience building and scaling engineering teams in distributed or international settings.
  • Strong execution track record on complex, cross-functional infrastructure programs with high ambiguity.
  • Clear written and verbal communication with technical and non-technical audiences, including leadership and executives.
  • Experience managing vendors, external delivery partners, and third-party integrations in a platform context.
  • Hands-on experience with AI/ML observability, such as MLflow tracing, LLM evaluation pipelines, or observability for agentic AI systems, preferred.
  • Familiarity with Amazon Bedrock, ECS Fargate, or LangGraph-based multi-agent architectures, preferred.
  • Experience with cloud cost governance and FinOps practices for observability tooling, preferred.
  • Exposure to data platform observability and data quality monitoring in a lakehouse context, preferred.
  • CS, Engineering, or equivalent degree, or commensurate practical experience.
  • Legally eligible to work in the U.S. on an ongoing basis.

Benefits

  • Employer-subsidized medical, vision, and dental coverage for full-time employees.
  • 401(k) match of 50% of your contribution up to the first 6% of eligible pay.
  • Monthly stipend to support work and productivity.
  • Flexible Time Away Program plus sick time off.
  • Company-sponsored life insurance, short-term disability, and long-term disability coverage for U.S. employees.
  • 12 paid holidays per year for U.S. employees.
  • Up to 24 weeks of parental leave.
  • Personal paid Volunteer Day.
  • Professional growth and development opportunities, including access to Udemy online courses.
  • Company-funded perks including a counseling membership, local retail discounts, and a personal Smartsheet account.
  • Teleworking options from any registered location in the U.S. for this role.
  • Market-competitive incentive opportunity in addition to base salary.
  • US base salary range of $205,000 to $275,000.
