Plume

Plume provides virtual gender-affirming healthcare services specifically designed for the transgender and gender nonconforming community, offering accessible and supportive care through a mobile platform.

Family Services
51-250 employees
Founded 2019
$38M raised

Description

  • Build and maintain production-grade data pipelines in cloud data warehouses such as BigQuery or equivalent.
  • Design and develop dbt models across bronze, silver, and gold layers with testing, documentation, and incremental loading.
  • Create and optimize Airflow DAGs for scheduling, dependencies, monitoring, error handling, and alerting.
  • Implement dimensional data models and data mart structures that support clinical BI and ML feature consumption.
  • Build dashboards and visualizations in Looker or equivalent BI tools in collaboration with cross-functional stakeholders.
  • Integrate healthcare data from EHRs, Stripe, third-party APIs, and application databases into a unified data platform.
  • Apply HIPAA-compliant data handling practices, including PHI/PII masking, tokenization, audit logging, and access controls.
  • Architect and implement RAG pipelines, including document ingestion, chunking, embedding generation, and retrieval.
  • Support MLOps workflows, including training pipeline maintenance, deployment support, monitoring, and retraining triggers.
  • Review teammates’ pull requests, provide technical feedback, and uphold engineering standards.
  • Collaborate with product managers to define requirements and deliver reliable data and AI products.
  • Monitor and triage pipeline and data quality failures, escalating architectural issues when needed.
  • Document pipeline designs, data models, and technical decisions to support governance and lineage tracking.
  • Evaluate new tools and frameworks through hands-on prototyping and technical assessment.

Requirements

  • 5+ years of hands-on experience in data engineering, analytics engineering, or a closely related role.
  • 2+ years of experience in the healthcare industry with knowledge of healthcare data standards, clinical workflows, regulated data environments, and domain-specific reporting.
  • Working knowledge of HIPAA, including PHI/PII classification, data masking, audit logging, and access control requirements.
  • Production experience with at least one major cloud data warehouse: BigQuery, Snowflake, or Redshift.
  • Strong hands-on experience with dbt Core or dbt Cloud, including incremental models, tests, documentation, and multi-environment workflows.
  • Deep experience with Apache Airflow for orchestration, including DAG design, scheduling, monitoring, and failure handling.
  • Demonstrated knowledge of dimensional modeling, including star and snowflake schemas, SCD Type 1/2, and fact/dimension table design.
  • Hands-on experience delivering dashboards and reports in an enterprise BI tool such as Looker, Power BI, Tableau, or Qlik.
  • Proficiency in Python for data pipelines, API integrations, and automation, including Pandas, PySpark, or similar.
  • Practical exposure to RAG pipeline development and LLM integration using LangChain, LangGraph, or LlamaIndex.
  • Hands-on exposure to MLOps concepts, including model deployment, monitoring, and retraining workflows.
  • Knowledge of CI/CD tooling for data and AI workloads, such as GitHub Actions or dbt Cloud CI.
  • Strong understanding of data quality and governance principles, including lineage, access controls, data contracts, and automated testing.
  • Experience with data governance tools such as OpenMetadata.
  • Excellent written and verbal communication skills and the ability to collaborate across engineering, analytics, and clinical teams.
  • Ability to work independently while keeping leadership informed of progress, blockers, and risks.
  • Experience with real-time or streaming data pipelines using Kafka, Kinesis, or Pub/Sub is preferred.
  • Knowledge of vector databases such as Pinecone, Weaviate, FAISS, or Chroma is preferred.
  • Familiarity with responsible AI principles, including bias detection and model explainability in healthcare, is preferred.
  • Experience with data observability tools such as Monte Carlo, Bigeye, or Soda is preferred.
  • Familiarity with data lakehouse patterns such as Delta Lake, Iceberg, or Apache Hudi is preferred.
  • Experience working toward or maintaining SOC 2 or HITRUST certification is preferred.
  • Familiarity with semantic layer tools such as Looker LookML or dbt Semantic Layer is preferred.
  • Experience with population health, revenue cycle, or clinical quality reporting datasets is preferred.
  • Exposure to Kubernetes or containerized ML workloads is preferred.
  • Must reside in the USA and be legally authorized to work there.

Benefits

  • $158,000 - $168,000 annual salary.
  • Ground-floor equity (Series B).
  • Free medical, dental, and vision coverage starting on the first of the month after your full-time start date.
  • Unlimited PTO.
  • 11 paid holidays plus a one-week company shutdown in December.
  • 401(k) retirement plan.
  • Free Plume and BetterHelp subscriptions.
