PointClickCare

PointClickCare provides a leading cloud-based healthcare software platform that enables long-term and post-acute care providers to effectively manage the complete lifecycle of resident care while enhancing operational efficiency and improving resident ...

Health Care Providers & Services
1K-5K
Founded 2000
$232M raised

Description

  • Own the gold data layer by transforming silver tables into curated, semantically rich, documented gold datasets for AI development.
  • Reverse-engineer data semantics by working with product engineers, clinical experts, and workflow experts to understand how data is created and represented.
  • Bridge researcher needs with data design by translating AI applied research requirements into reusable gold data products and documentation.
  • Curate datasets across modalities, including structured tables, unstructured content, features, labels, and chunked/tagged data for different AI use cases.
  • Build reusable silver-to-gold data pipelines in Databricks/Spark as scheduled and observable workloads.
  • Automate data quality, filtering, synthesis, labeling, and weak supervision workflows for AI data preparation.
  • Maintain reproducible dataset snapshots, lineage, and semantic definitions for downstream AI R&D reuse.
  • Collaborate with AI researchers, data platform, product, clinical, and workflow teams throughout the R&D lifecycle.
  • Support model development, evaluation, experimentation, and ongoing operational maintenance across classical ML, generative AI, RAG, and agentic approaches.

Requirements

  • 5+ years building production data systems, including at least 2 years supporting ML or AI workloads.
  • Advanced Python, SQL, and PySpark/Databricks experience for working with large, messy data.
  • Expert-level SQL and the ability to read complex stored procedures and reverse-engineer business logic from queries.
  • Strong Databricks ecosystem experience, including Delta Lake, Unity Catalog, Spark/PySpark tuning, and MLflow.
  • Working knowledge of AI concepts such as embeddings, tokenization, feature engineering, point-in-time correctness, train/validation/test splits, and data drift.
  • Experience transforming unstructured data such as text, PDFs, transcripts, and logs into model-ready forms.
  • Familiarity with AI-friendly storage and formats such as Parquet and Hugging Face datasets, plus partitioning, sharding, and caching concepts.
  • Experience with data quality and synthesis techniques such as programmatic labeling, weak supervision, MinHash/LSH, and LLM-generated synthetic data.
  • Experience with pipeline orchestration and dataset versioning tools such as Airflow, Databricks Workflows, Dagster, Prefect, and Unity Catalog.
  • Experience handling regulated or sensitive data under controlled access, including HIPAA or equivalent, and familiarity with de-identification concepts.
  • Git-based version control and CI/CD experience for data and code.
  • Strong written documentation skills and the ability to elicit requirements from technical and non-technical experts.
  • Bachelor’s degree in computer science, data science, engineering, statistics, or a related field, or equivalent practical experience.
  • Preferred: Hands-on EHR data experience in skilled nursing, long-term care, post-acute care, or senior living.
  • Preferred: Working knowledge of clinical terminologies and data standards such as ICD-10, SNOMED CT, LOINC, HL7v2, FHIR, and CCDA.
  • Preferred: dbt experience for transformation and testing.
  • Preferred: Familiarity with training-side ML frameworks such as PyTorch to debug data-side bottlenecks.
  • Preferred: Experience supporting LLM or foundation-model training or fine-tuning data pipelines.
  • Preferred: Clinical NLP, OCR, document parsing, or ASR/transcript pipeline experience.
  • Preferred: Experience with data lineage and catalog tools.
  • Preferred: Prior experience embedded inside an AI or ML research team.
  • Preferred: Master’s degree in a relevant quantitative or computer science field.

Benefits

  • Benefits starting from day 1.
  • Retirement plan matching.
  • Flexible paid time off.
  • Wellness support programs and resources.
  • Parental and caregiver leave.
  • Fertility and adoption support.
  • Continuous development support program.
  • Employee assistance program.
  • Allyship and inclusion communities.
  • Employee recognition and more.


