PointClickCare

PointClickCare provides a leading cloud-based healthcare software platform that enables long-term and post-acute care providers to effectively manage the complete lifecycle of resident care while enhancing operational efficiency and improving resident ...

Health Care Providers & Services
1K-5K employees
Founded 2000
$232M raised

Description

  • Own the gold data layer that transforms silver tables into curated, semantically rich, documented datasets for AI model development.
  • Reverse-engineer data semantics by working with product engineers, clinical experts, and workflow experts to understand source systems and data behavior.
  • Bridge researcher needs with data design by translating AI applied research requirements into reusable gold data products.
  • Curate datasets across modalities, including structured tables and unstructured content for generative AI, RAG, predictive modeling, and classical ML use cases.
  • Build reusable transformation pipelines from silver to gold in Databricks/Spark as scheduled and observable workloads.
  • Automate data quality, filtering, labeling, synthesis, and weak supervision workflows for AI research data preparation.
  • Maintain reproducible dataset snapshots, lineage, and semantic definitions for downstream AI R&D teams.
  • Document datasets, transformations, provenance, and quirks so others can reliably reuse the data assets.
  • Collaborate with AI researchers, data platform teams, product teams, and clinical/workflow experts across the R&D lifecycle.
  • Support model development, experimentation, evaluation, and ongoing operations by delivering AI-ready data assets.

Requirements

  • 5+ years building production data systems, including at least 2 years supporting ML or AI workloads.
  • Advanced Python, SQL, and PySpark/Databricks experience for large, messy data.
  • Expert-level SQL ability, including reading complex stored procedures and reverse-engineering business logic from queries.
  • Strong Databricks ecosystem experience, including Delta Lake, Unity Catalog, Spark/PySpark tuning, and MLflow.
  • Working knowledge of AI concepts such as embeddings, tokenization, feature engineering, point-in-time correctness, data drift, and train/validation/test splits.
  • Experience transforming unstructured content such as text, PDFs, transcripts, and logs into model-ready data.
  • Experience with AI-friendly formats and storage layouts such as Parquet and Hugging Face datasets, including partitioning, sharding, and caching.
  • Experience with pipeline orchestration tools such as Airflow, Databricks Workflows, Dagster, or Prefect, and with dataset versioning.
  • Experience handling regulated or sensitive data under controlled access, such as HIPAA or equivalent, and familiarity with de-identification concepts.
  • Git-based version control and CI/CD experience for data and code.
  • Strong written documentation skills and ability to elicit requirements from technical and non-technical experts.
  • Bachelor’s degree in computer science, data science, engineering, statistics, or a related field, or equivalent practical experience.
  • Preferred: hands-on EHR data experience in skilled nursing, long-term care, post-acute care, or senior living.
  • Preferred: working knowledge of clinical terminologies and standards such as ICD-10, SNOMED CT, LOINC, HL7 v2, FHIR, and C-CDA.
  • Preferred: dbt experience for transformation and testing.
  • Preferred: familiarity with training-side ML frameworks such as PyTorch and experience supporting LLM or foundation-model training/fine-tuning pipelines.
  • Preferred: experience with clinical NLP, OCR, document parsing, or ASR/transcript pipelines.
  • Preferred: experience with data lineage and catalog tools.
  • Preferred: prior experience embedded within an AI or ML research team.
  • Preferred: master’s degree in a relevant quantitative or computer science field.

Benefits

  • Benefits starting from Day 1.
  • Retirement plan matching.
  • Flexible paid time off.
  • Wellness support programs and resources.
  • Parental and caregiver leave.
  • Fertility and adoption support.
  • Continuous development support program.
  • Employee assistance program.
