Data Engineering Intern (Spring/Summer 2026)

19 hours, 15 minutes ago
Internship
Entry Level
Artificial Intelligence and Machine Learning
Ladders

TheLadders is a job marketplace that connects professionals to high-paying ($100K+) opportunities, with a focus on career advancement and growth.

Professional Services
51-250
Founded 2003

Description

  • Support the development and maintenance of data pipelines using Databricks, Spark, and similar technologies.
  • Write and optimize SQL and Python scripts for data transformation, integration, and automation tasks.
  • Develop automation scripts that populate metadata and comments across Databricks tables using structured definitions such as CSV files.
  • Assist in building a proof-of-concept for an automated data dictionary maintained using existing Databricks metadata.
  • Contribute to prototyping an AI-powered knowledge agent that uses internal data and documentation to answer common questions.
  • Collaborate with team members to improve data quality, cataloging, and metadata management across the ecosystem.
  • Participate in code reviews, design discussions, and sprint ceremonies to learn engineering best practices.
  • Document findings, workflows, and automation processes for future reuse.
  • Perform other duties as assigned.

Requirements

  • Actively pursuing a Bachelor’s or Master’s degree in Computer Science, Software Engineering, Information Systems, or a related technical field.
  • Foundational knowledge of Python and SQL for data manipulation and analysis.
  • Familiarity with ETL concepts and structured data formats such as CSV, JSON, and Parquet.
  • Interest in cloud-based data platforms, with Azure preferred.
  • Strong analytical and problem-solving skills with an eagerness to learn.
  • Effective communication and teamwork skills.
  • Exposure to Databricks, Apache Spark, or other distributed data frameworks is preferred.
  • Familiarity with Git or version control practices is preferred.
  • Interest in AI/LLM-based automation, data documentation, or metadata management is preferred.
  • Prior project or internship experience in data engineering or cloud technologies is preferred.
