Data/Infrastructure Advocate Engineer - EMEA Remote

1 month ago
Full-time
Senior
DevOps and Infrastructure
Hugging Face

Hugging Face advances AI through open collaboration, offering a platform for ML model collaboration and tools for building AI projects.

IT Services
51-250
Founded 2016
$395M raised

Description

  • Grow and nurture the open-source data and infrastructure community by launching initiatives, collaborating with data-focused groups, and organizing events or challenges.
  • Engage with external communities (e.g., Apache Parquet, Open Table Formats, data engineering forums) to promote best practices and Hugging Face tools.
  • Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration by curating and showcasing datasets, benchmarks, and tools like Xet.
  • Create demos, benchmarks, tools, and example notebooks (e.g., Colab) to illustrate best practices for data storage, versioning, and pipeline optimization.
  • Experiment with Xet, Parquet, and other data formats to demonstrate their potential for machine learning and data engineering workflows.
  • Produce high-quality technical content (tutorials, blog posts, videos) that makes complex topics accessible to developers and data engineers.
  • Share insights and guidance on storage optimization, dataset versioning, deduplication, and related workflows to empower users.
  • Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.
  • Collaborate with teams such as Datasets, Hub, and Infrastructure to shape how developers interact with data on the platform, and ensure that released datasets and tools are well documented with clear examples and benchmarks.

Requirements

  • Strong technical experience with Python and data libraries such as pandas, pyarrow, and huggingface/datasets.
  • Familiarity with storage systems and formats including Parquet, Open Table Formats, and object storage like S3.
  • Hands-on experience building and experimenting with data tools, storage optimization, and dataset versioning.
  • Ability to clearly explain complex topics (e.g., deduplication, compression, Parquet editing) through writing, demos, or talks.
  • Active participation in developer and open-source communities (GitHub, Discord, forums) and a passion for knowledge sharing.
  • Comfort working in fast-moving environments and building in public to inspire others.
  • Experience creating demos, benchmarks, tutorials, or example notebooks to illustrate technical workflows.
  • Interest in advocating for platform adoption and collaborating with product and infrastructure teams to shape developer workflows.

Benefits

  • Flexible working hours and remote work options, with office spaces available in NYC and Paris.
  • Health, dental, and vision benefits for employees and their dependents.
  • Parental leave and flexible paid time off.
  • Reimbursement for relevant conferences, training, and education.
  • Company equity included as part of the compensation package.
  • Support for remote employees to visit offices, plus workstation equipment if needed.
