Hugging Face

Hugging Face: Advancing AI through open collaboration. Platform for ML model collaboration and tools for AI project creation.

IT Services

Information Technology

51-250 (170)

Founded 2016

$395M raised

7 open positions

Data/Infrastructure Advocate Engineer - EMEA Remote

2 months, 1 week ago

France, Europe, Middle East, Africa

Full-time

Senior

Data Engineer

DevOps and Infrastructure

AWS, GitHub, pandas, Python

Description

Grow and nurture the open-source data and infrastructure community by launching initiatives, collaborating with data-focused groups, and organizing events or challenges.
Engage with external communities (e.g., Apache Parquet, open table formats, data engineering forums) to promote best practices and Hugging Face tools.
Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration by curating and showcasing datasets, benchmarks, and tools like Xet.
Create demos, benchmarks, tools, and example notebooks (e.g., Colab) to illustrate best practices for data storage, versioning, and pipeline optimization.
Experiment with Xet, Parquet, and other data formats to demonstrate their potential for machine learning and data engineering workflows.
Produce high-quality technical content (tutorials, blog posts, videos) that makes complex topics accessible to developers and data engineers.
Share insights and guidance on storage optimization, dataset versioning, deduplication, and related workflows to empower users.
Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.
Collaborate with the Datasets, Hub, and Infrastructure teams to shape how developers interact with data on the platform, and ensure that released datasets and tools are well documented, with clear examples and benchmarks.

Requirements

Strong technical experience with Python and data libraries such as pandas, pyarrow, and huggingface/datasets.
Familiarity with storage systems and formats, including Parquet, open table formats, and object storage such as S3.
Hands-on experience building and experimenting with data tools, storage optimization, and dataset versioning.
Ability to clearly explain complex topics (e.g., deduplication, compression, Parquet editing) through writing, demos, or talks.
Active participation in developer and open-source communities (GitHub, Discord, forums) and a passion for knowledge sharing.
Comfort working in fast-moving environments and building in public to inspire others.
Experience creating demos, benchmarks, tutorials, or example notebooks to illustrate technical workflows.
Interest in advocating for platform adoption and collaborating with product and infrastructure teams to shape developer workflows.

Benefits

Flexible working hours and remote work options, with office spaces available in NYC and Paris.
Health, dental, and vision benefits for employees and their dependents.
Parental leave and flexible paid time off.
Reimbursement for relevant conferences, training, and education.
Company equity included as part of the compensation package.
Support for remote employees to visit the offices, plus workstation equipment if needed.
