Data Scientist/Machine Learning Engineer

1 month, 3 weeks ago
Full-time
Mid Level
Software Development

Sumble

Sumble provides AI-powered account intelligence for enterprise sales teams, helping users understand team structures, reporting lines, tech stacks, and other account signals to drive pipeline.

Technology, Information and Internet
11-50
$38M raised

Description

  • Fine-tune small language models for data quality and enrichment workflows.
  • Improve the quality of existing data using scalable methods and validation techniques.
  • Verify and correct entity relationships such as company URLs, headquarters addresses, and parent-subsidiary mappings.
  • Add new signals by scrubbing, matching, normalizing, and aligning them to the existing ontology.
  • Push data quality solutions into production and support them in data pipelines and backend systems.
  • Use techniques such as LLM validation, SERP checks, and cross-source triangulation to improve accuracy.
  • Work with growing sets of data sources, machine learning models, and large-scale data operations.
  • Contribute to systems that support efficient analytics and a strong product-led growth experience.

Requirements

  • Must be located within Americas time zones.
  • Experience with small language models, LLMs, or machine learning-based data workflows, as implied by the role.
  • Familiarity with Python and backend or data pipeline environments.
  • Relevant experience includes data cleaning, matching, normalization, or ontology mapping.
  • Knowledge of ML/data tooling such as PyTorch, Hugging Face, Gemma models, LoRA, or vLLM is a plus.
  • Experience with FastAPI, React, TypeScript, PostgreSQL, DuckDB, or Google Cloud Platform is a plus.
  • Ability to work on production systems and operational data infrastructure.
  • Experience in environments handling noisy datasets and multiple data sources is a plus.

Benefits

  • Medical, dental, and vision coverage for US employees.
  • 401(k) plan for US employees.
  • Target of 4 weeks of PTO.


Similar Roles

AI Security Engineer - Mid-Atlantic region (Remote in VA, MD, PA, NC, DE, NJ, or DC)

GuidePoint Security 251-1K Internet Software & Services

GuidePoint Security is hiring an AI Security Engineer to help customers design, implement, secure, and operate generative AI security solutions across enterprise environments.

Cybersecurity Generative AI LLM Python SageMaker Terraform
10 hours, 7 minutes ago

Data Scientist 1

Adswerve 251-1K Media

Adswerve is hiring a Data Scientist 1 for its Tech Services team to work on client-facing data engineering and analytics solutions that improve marketing performance and business outcomes.

AWS Azure GCP Google Ads Google Analytics HTML JavaScript Machine Learning Python SQL
10 hours, 22 minutes ago

Principal Data Scientist - Agent Builder

Elastic 1K-5K Internet Software & Services

Elastic is hiring a Principal Data Scientist to shape evaluation and quality strategy for its conversational and agentic search platform built on Elasticsearch, with the goal of turning ambiguous AI search problems into reliable product improvements.

Elasticsearch LLM Machine Learning NLP Pandas Python PyTorch Transformers
10 hours, 22 minutes ago

Machine Learning Engineer, Next-Generation Recommendation Systems (New Grad / PhD)

Unity 5K-10K Internet Software & Services

Unity’s Vector AI team is hiring a PhD graduate to develop and productionize large-scale ranking and recommendation systems that optimize ad relevance, user value, and delivery outcomes across billions of monthly users.

Feature Engineering LLM Machine Learning Python PyTorch Reinforcement Learning TensorFlow
10 hours, 22 minutes ago
