Data/Infrastructure Advocate Engineer - EMEA Remote

1 month, 2 weeks ago
Full-time
Senior
DevOps and Infrastructure
Hugging Face

Hugging Face advances AI through open collaboration. It provides a platform for collaborating on ML models and tools for building AI projects.

IT Services
51-250
Founded 2016
$395M raised

Description

  • Grow and nurture the open-source data and infrastructure community by launching initiatives, collaborating with data-focused groups, and organizing events or challenges.
  • Engage with external communities (e.g., Apache Parquet, Open Table Formats, data engineering forums) to promote best practices and Hugging Face tools.
  • Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration by curating and showcasing datasets, benchmarks, and tools like Xet.
  • Create demos, benchmarks, tools, and example notebooks (e.g., Colab) to illustrate best practices for data storage, versioning, and pipeline optimization.
  • Experiment with Xet, Parquet, and other data formats to demonstrate their potential for machine learning and data engineering workflows.
  • Produce high-quality technical content (tutorials, blog posts, videos) that makes complex topics accessible to developers and data engineers.
  • Share insights and guidance on storage optimization, dataset versioning, deduplication, and related workflows to empower users.
  • Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.
  • Collaborate with teams such as Datasets, Hub, and Infrastructure to shape how developers interact with data on the platform, and ensure that released datasets and tools are well documented with clear examples and benchmarks.

Requirements

  • Strong technical experience with Python and data libraries such as pandas, pyarrow, and huggingface/datasets.
  • Familiarity with storage systems and formats including Parquet, Open Table Formats, and object storage like S3.
  • Hands-on experience building and experimenting with data tools, storage optimization, and dataset versioning.
  • Ability to clearly explain complex topics (e.g., deduplication, compression, Parquet editing) through writing, demos, or talks.
  • Active participation in developer and open-source communities (GitHub, Discord, forums) and a passion for knowledge sharing.
  • Comfort working in fast-moving environments and building in public to inspire others.
  • Experience creating demos, benchmarks, tutorials, or example notebooks to illustrate technical workflows.
  • Interest in advocating for platform adoption and collaborating with product and infrastructure teams to shape developer workflows.

Benefits

  • Flexible working hours and remote work options, with office spaces available in NYC and Paris.
  • Health, dental, and vision benefits for employees and their dependents.
  • Parental leave and flexible paid time off.
  • Reimbursement for relevant conferences, training, and education.
  • Company equity included as part of the compensation package.
  • Support for remote employees to visit offices, plus workstation equipment where needed.
