Generalist - English & Hindi

2 days ago
Contract
Junior
Artificial Intelligence and Machine Learning
Weekday

Weekday helps companies hire engineers who are vouched for by other software engineers, which also gives the referring engineers a source of passive income. Its services include drafting outreach messages, shortlisting candidates, and conducting reference checks. Backed by Y Combin...

Construction & Engineering
11-50
Founded 2020

Description

  • Evaluate AI-generated responses for accuracy, relevance, and effectiveness in answering user queries.
  • Perform fact-checking using trusted public sources and external tools to identify factual errors.
  • Create high-quality evaluation data by identifying strengths, weaknesses, and factual inaccuracies in outputs.
  • Assess reasoning quality, clarity, tone, and completeness of model responses.
  • Ensure responses align with expected conversational standards and system guidelines.
  • Apply consistent annotations using defined taxonomies, benchmarks, and evaluation frameworks.
  • Compare multiple model outputs and make detailed qualitative judgments to inform improvements.
  • Provide clear, structured feedback that directly supports model performance improvements and user experience.
  • Deliver consistent, high-quality evaluation outputs to contribute to building reliable AI systems used at scale.

Requirements

  • Bachelor’s degree in any discipline.
  • Native-level fluency in Hindi (ILR 5 / CEFR C2) and strong proficiency in English.
  • Hands-on experience using large language models (LLMs) and understanding their real-world applications.
  • Excellent writing skills with the ability to provide clear, structured, and nuanced feedback.
  • Strong attention to detail and the ability to identify subtle issues in content.
  • Comfortable working across diverse topics, domains, and requirements.
  • Background in fields requiring structured analytical thinking such as research, analytics, policy, linguistics, or engineering.
  • Strong college-level mathematics and reasoning skills.
  • Nice-to-have: experience with RLHF (Reinforcement Learning from Human Feedback), model evaluation, or data annotation.
  • Nice-to-have: background in content writing, editing, or quality review; familiarity with evaluation rubrics and benchmarks; and experience comparing multiple outputs.

Benefits

  • Compensation: $12.19 per hour (contract rate).
  • Flexible, fully remote contractor work with the ability to manage your own schedule.
  • Competitive compensation aligned with expertise and contribution level.
  • Weekly payments processed via Stripe or Wise based on completed work.
  • Independent contractor engagement with project duration varying by performance and business needs.
  • Work involves only publicly available information (no access to confidential/proprietary data).
  • Opportunity to work at the forefront of human-in-the-loop AI development and contribute to conversational AI systems used globally.


