Data/Infrastructure Advocate Engineer - US Remote

2 weeks, 1 day ago
Full-time
Mid Level
DevOps and Infrastructure
Hugging Face

Hugging Face advances AI through open collaboration, offering a platform for ML model collaboration and tools for building AI projects.

IT Services
51-250 employees
Founded 2016
$395M raised

Description

  • Grow and nurture the open-source data and infrastructure community through initiatives, events, challenges, and collaborations.
  • Engage with communities such as Apache Parquet, Open Table Formats, and data engineering forums to promote best practices and Hugging Face tools.
  • Promote the Hugging Face Hub as a platform for data storage, versioning, and collaboration by showcasing datasets, benchmarks, and Xet use cases.
  • Create demos, benchmarks, and tools such as Colab notebooks to demonstrate best practices for data storage and versioning.
  • Experiment with Xet, Parquet, and other formats to illustrate efficient large-dataset updates, Parquet editing, and deduplication workflows.
  • Produce tutorials, blog posts, and videos that make complex technical topics accessible to developers.
  • Share insights on storage optimization, dataset versioning, and deduplication through content and community engagement.
  • Actively participate in Discord, GitHub, forums, and other online communities to answer questions and foster collaboration.
  • Ensure datasets and tools released on the Hub are well-documented with clear examples, benchmarks, and use cases.
  • Collaborate with Datasets, Hub, and Infrastructure teams to shape how developers interact with data on the platform.

Requirements

  • 3+ years of experience in developer relations or developer advocacy, ideally for data engineering, infrastructure, or ML tools and platforms.
  • An established public presence as a technical voice with a demonstrable, engaged audience on LinkedIn and X (Twitter).
  • A portfolio of developer-facing content such as tutorials, blog posts, videos, demos, benchmarks, or conference talks.
  • Hands-on experience building and engaging open-source or developer communities across Discord, GitHub, or forums.
  • Strong Python skills.
  • Hands-on experience with data libraries such as pandas, pyarrow, and huggingface/datasets.
  • Practical experience with storage systems and formats including Parquet, Open Table Formats, and S3.
  • Working knowledge of dataset versioning, deduplication, and compression.
  • Ability to explain complex technical topics clearly through writing, demos, or talks.
  • Fluent written and spoken English.
  • Experience with the Hugging Face Hub and datasets ecosystem, or with Xet, is preferred.
  • Open-source maintainer or contributor experience is preferred.
  • Familiarity with large-scale data pipelines and data engineering workflows is preferred.
  • Experience producing notebooks, such as Colab, for tutorials and benchmarks is preferred.
  • Applicants who do not meet every requirement are still encouraged to apply.

Benefits

  • Reimbursement for relevant conferences, training, and education.
  • Flexible working hours and remote work options.
  • Health, dental, and vision benefits for employees and their dependents.
  • Parental leave and flexible paid time off.
  • Company equity as part of the compensation package.
  • Opportunity for remote employees to visit Hugging Face offices in NYC and Paris.
  • Workstation outfitting support, if needed.
