Elastic

Elastic is a leading platform for search-powered solutions, providing real-time insights and making data usable for developers and enterprises worldwide.

Industry: Internet Software & Services
Company size: 1K-5K employees
Founded: 2010

Description

  • Define the evaluation strategy for conversational and agentic search, including offline and online evaluation, golden datasets, rubrics, LLM-as-judge calibration, groundedness and citation checks, and A/B testing.
  • Lead the design of quality metrics and decision frameworks for RAG, agents, tools, model selection, agent routing, prompt behavior, and cost/latency trade-offs.
  • Build, compare, and improve retrieval and re-ranking approaches, including sparse and dense retrieval, vector search, query understanding, semantic rewrites, and context enrichment.
  • Turn experimental results into product and business decisions about model choice, request routing, tool exposure, and agent customization across Elastic use cases.
  • Partner with engineering to productionize evaluation pipelines, telemetry, dashboards, CI guardrails, and regression detection for chat quality, helpfulness, latency, and cost.
  • Influence roadmap direction by identifying high-leverage quality gaps, proposing practical solutions, and communicating trade-offs to product, engineering, and leadership.
  • Mentor data scientists and engineers in experiment design, evaluation methodology, statistical rigor, and LLM-powered system improvement.
  • Share outcomes through docs, notebooks, PRs, dashboards, technical proposals, and cross-functional reviews.

Requirements

  • 8+ years of applied data science or machine learning experience.
  • Deep expertise in information retrieval, NLP, ranking, semantic search, RAG, or LLM-powered product experiences.
  • Proven experience defining and leading evaluation for production AI/ML systems, including offline metrics, online experimentation, LLM-as-judge methods, groundedness, citation quality, and model comparison.
  • Hands-on experience with Python, PyTorch/Transformers, Pandas, notebooks, reproducible experiments, versioned datasets, and clean, reviewable code.
  • Strong understanding of retrieval systems, including dense and sparse retrieval, re-ranking, vector search, query understanding, and evaluation metrics such as nDCG, MRR, Recall@k, precision, and latency/cost trade-offs.
  • Experience collaborating with engineering teams to move from prototype to production, including telemetry design, dashboards, CI guardrails, and quality regression tracking.
  • Practical Elasticsearch experience or experience with similar search and distributed data systems.
  • ES|QL familiarity is a plus.
  • Excellent written and verbal communication skills for explaining technical trade-offs to engineering, product, design, and leadership audiences.
  • Collaborative, low-ego style with strong mentoring ability and a track record of raising standards in distributed teams.

Benefits

  • Competitive pay based on the work you do here and not your previous salary.
  • Health coverage for you and your family in many locations.
  • Flexible locations and schedules for many roles.
  • A generous number of vacation days each year.
  • Up to $2,000 (or local currency equivalent) in company matching for financial donations and service.
  • Up to 40 hours each year for volunteer projects you care about.
  • At least 16 weeks of parental leave.
