Staff Data Engineer

2 weeks, 6 days ago
Full-time
Lead
Software Development
Robots & Pencils

Robots & Pencils is a digital innovation firm that helps organizations use mobile, web, and AI technologies to modernize their operations and build efficient, user-centered solutions that improve productivity and decision-making.

Industry: IT Services
Company size: 51-250
Founded: 2009

Description

  • Define data architecture and platform strategy across pipelines, warehouses, and data lakes.
  • Build and optimize scalable data pipelines for batch and real-time processing.
  • Define and enforce data governance, quality standards, and compliance frameworks.
  • Build monitoring, logging, and alerting for data pipelines and services, and contribute to CI/CD workflows.
  • Drive data platform modernization with a focus on performance, cost, and scalability.
  • Design and implement data contracts and event flows with backend, platform, and engineering teams.
  • Lead the design and implementation of data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows.
  • Integrate data services with APIs, middleware, and third-party systems to support end-to-end data consumption.
  • Partner with leadership on data strategy and translate technical decisions into actionable direction.
  • Collaborate with engineering, analytics, AI, and product teams to align data platforms with broader goals.
  • Mentor junior and mid-level engineers and establish standards that improve team-wide quality and consistency.

Requirements

  • 7+ years of professional data engineering experience, including leadership on complex data platform initiatives.
  • Strong system architecture background with expertise in distributed data systems.
  • Expert proficiency in Python, Scala, and SQL.
  • Deep experience with cloud-native data platforms and enterprise data warehousing.
  • Strong expertise in data pipeline orchestration and processing.
  • Strong experience with streaming platforms and real-time data processing, such as Kafka, Kinesis, or Pub/Sub.
  • Strong data modeling expertise and experience with data transformation.
  • Strong experience with data quality, governance, and compliance frameworks.
  • Strong experience with container orchestration and CI/CD for data systems.
  • Strong experience building data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows.
  • Demonstrated leadership and technical mentoring experience across a team or organization.
  • Strong stakeholder communication skills and the ability to explain deep technical topics to both technical and non-technical audiences.
  • Demonstrable day-to-day use and expert knowledge of AI-assisted coding tools such as Claude and Cursor.
  • Excellent problem-solving skills and the ability to navigate highly ambiguous technical and business challenges.
  • Experience with data mesh or data fabric concepts, lakehouse architectures, or governance framework implementation is a plus.
