Elastic

Elastic is a leading platform for search-powered solutions, providing real-time insights and making data usable for developers and enterprises worldwide.

Internet Software & Services
1K-5K
Founded 2010

Description

  • Define the evaluation strategy for conversational and agentic search, including offline and online evaluation, golden datasets, rubrics, LLM-as-judge calibration, groundedness and citation checks, and A/B testing.
  • Lead the design of quality metrics and decision frameworks for RAG, agents, tools, model selection, agent routing, prompt behavior, and cost/latency trade-offs.
  • Build, compare, and improve retrieval and re-ranking approaches, including sparse and dense retrieval, vector search, query understanding, semantic rewrites, and context enrichment.
  • Turn experimental results into product and business decisions about model choice, request routing, tool exposure, and agent customization across Elastic use cases.
  • Partner with engineering to productionize evaluation pipelines, telemetry, dashboards, CI guardrails, and regression detection for chat quality, helpfulness, latency, and cost.
  • Influence roadmap direction by identifying high-leverage quality gaps, proposing practical solutions, and communicating trade-offs to product, engineering, and leadership.
  • Mentor data scientists and engineers in experiment design, evaluation methodology, statistical rigor, and LLM-powered system improvement.
  • Share outcomes through docs, notebooks, PRs, dashboards, technical proposals, and cross-functional reviews.

Requirements

  • 8+ years of applied data science or machine learning experience.
  • Deep expertise in information retrieval, NLP, ranking, semantic search, RAG, or LLM-powered product experiences.
  • Proven experience defining and leading evaluation for production AI/ML systems, including offline metrics, online experimentation, LLM-as-judge methods, groundedness, citation quality, and model comparison.
  • Hands-on experience with Python, PyTorch/Transformers, Pandas, notebooks, reproducible experiments, versioned datasets, and clean, reviewable code.
  • Strong understanding of retrieval systems, including dense and sparse retrieval, re-ranking, vector search, and query understanding, along with evaluation metrics such as nDCG, MRR, Recall@k, and precision, and the associated latency/cost trade-offs.
  • Experience collaborating with engineering teams to move from prototype to production, including telemetry design, dashboards, CI guardrails, and quality regression tracking.
  • Practical Elasticsearch experience or experience with similar search and distributed data systems.
  • ES|QL familiarity is a plus.
  • Excellent written and verbal communication skills for explaining technical trade-offs to engineering, product, design, and leadership audiences.
  • Collaborative, low-ego style with strong mentoring ability and a track record of raising standards in distributed teams.

Benefits

  • Competitive pay based on the work you do here and not your previous salary.
  • Health coverage for you and your family in many locations.
  • Flexible locations and schedules for many roles.
  • A generous number of vacation days each year.
  • Up to $2,000 (or local currency equivalent) in company matching for financial donations and service.
  • Up to 40 hours each year for volunteer projects you care about.
  • At least 16 weeks of parental leave.
