LTS

Internet Software & Services

Information Technology

251-1K (960)

Founded 2005

2 open positions

RAG and Evaluation Engineer

2 weeks, 1 day ago

United States

Full-time

Mid Level

AI (Artificial Intelligence)

Data Science and Analytics

Python TypeScript

Description

Own the knowledge surface by building ingestion pipelines for source code, structured metadata, technical documentation, patches, and other customer-provided corpora.
Own retrieval quality across chunking, embeddings, hybrid retrieval, reranking, and freshness.
Own the evaluation harness for translation accuracy, dependency-map correctness, and overall agent quality.
Run A/B tests and regression detection across prompts, retrieval, and model changes.
Close the feedback loop by using production usage signals to improve evals and retrieval.
Define success metrics and determine whether the agent is actually improving when the team does not yet have a clear baseline.
Pair with Agent Engineers on the prompt-and-eval iteration cycle.
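
As a rough illustration of the regression-detection work described above, the sketch below scores two retrieval configurations with recall@k and flags the candidate if it drops below the baseline by more than a tolerance. Everything in it is an assumption made for illustration: the function names, the 0.02 tolerance, and the toy query labels are hypothetical and do not describe LTS's actual harness or data.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(
    runs: Dict[str, List[str]], labels: Dict[str, Set[str]], k: int
) -> float:
    """Average recall@k over every labeled query (missing runs score 0)."""
    scores = [recall_at_k(runs.get(q, []), rel, k) for q, rel in labels.items()]
    return sum(scores) / len(scores) if scores else 0.0


def is_regression(baseline: float, candidate: float, tolerance: float = 0.02) -> bool:
    """Flag the candidate when it falls more than `tolerance` below the baseline."""
    return candidate < baseline - tolerance


# Hypothetical labeled queries and two retrieval configurations (toy data only).
labels = {"q1": {"a", "b"}, "q2": {"c"}}
baseline_runs = {"q1": ["a", "b", "x"], "q2": ["c", "y", "z"]}
candidate_runs = {"q1": ["a", "x", "y"], "q2": ["z", "y", "c"]}

baseline_score = mean_recall_at_k(baseline_runs, labels, k=2)
candidate_score = mean_recall_at_k(candidate_runs, labels, k=2)
regressed = is_regression(baseline_score, candidate_score)

print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f} regressed={regressed}")
```

A fixed tolerance is the simplest possible gate; with a realistic number of queries, a paired significance test over per-query scores would likely be a better fit.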

Requirements

Bachelor’s degree in Computer Science, Engineering, Information Science, or a related field, plus 4 years of professional software engineering experience; equivalent experience may substitute for the degree requirement.
Production experience shipping a RAG system with measurable quality.
Experience with retrieval pipelines, including ingestion, chunking, embedding, hybrid retrieval, and reranking.
Strong applied evaluation skills, including benchmark design, regression detection, and LLM-as-judge patterns.
Ability to work in a fast-paced, collaborative environment.
Heavy day-to-day use of AI tooling, including running agents in parallel and model-as-collaborator workflows.
Strong TypeScript or Python skills.
Demonstrated experience in a remote work environment.
Ability to measure shipped systems with benchmarks and to back opinions on chunking and retrieval with data.
Comfort defining metrics before the team has fully aligned on them.
Nice to have: code-as-corpus retrieval experience.
Nice to have: applied IR or search-engine background.
Nice to have: synthetic data generation and LLM-as-judge experience.
Nice to have: open-source contributions to retrieval, evaluation, or RAG tooling.
Nice to have: experience integrating retrieval feedback loops with production usage.
Nice to have: healthcare IT or legacy modernization domain experience.
Nice to have: public technical writing or conference talks on retrieval or evaluation.

Benefits

Opportunity to support high-visibility federal missions in IT and healthcare.
A culture that values innovation, growth, collaboration, and quality.
Access to cutting-edge tools and technologies.
Comprehensive benefits for employees and their families.
A career path that rewards ambition and performance.
Salary transparency with compensation ranges shared upfront.

Similar Roles

Intern, Forward Deployed Engineering

Workato 251-1K IT Services

Workato is hiring a Forward Deployed Engineering intern to support AI-driven automation initiatives by helping build intelligent agents and enterprise workflow integrations on its Agentic AI platform.

India Internship Entry Level AI Engineer Software Engineer

JavaScript JSON LLM Python REST API Salesforce

12 hours, 41 minutes ago

Mortgage Underwriter - Freelance AI Trainer

Mindrift.ai: Be the “I” in AI Internet Software & Services

Mindrift is seeking mortgage underwriting and loan origination professionals for project-based AI evaluation work focused on testing and improving mortgage-related AI outputs and compliance decisions.

Greece Portugal Italy Part-time Mid Level AI (Artificial Intelligence) Financial Analyst

Up to $104k

12 hours, 56 minutes ago

Downeast Cider - AI Full Stack Developer

Jobrack 11-50 Professional Services

Downeast Cider is hiring an AI Full Stack Developer to become its first technical employee and build production-ready internal tools that improve operations across the business.

South Africa North Macedonia Poland Romania Serbia Georgia Contract Senior AI Engineer Full-stack Engineer

$60k-$72k

CRM GCP JavaScript NetSuite Python Shopify Snowflake SQL TypeScript

12 hours, 56 minutes ago

Claims Processing Agent - Freelance AI Trainer

Mindrift.ai: Be the “I” in AI Internet Software & Services

Mindrift is seeking part-time project-based insurance and claims specialists to evaluate and improve AI systems for auto insurance decision-making, fraud detection, and subrogation testing.

United States Part-time Mid Level AI (Artificial Intelligence)

Up to $125k

12 hours, 56 minutes ago
