Data Engineer (Python)

2 hours, 23 minutes ago
Full-time
Mid Level
Software Development
Orcrist Technologies

Orcrist Technologies specializes in providing advanced technology solutions, including data analytics, AI applications, and cybersecurity, aimed at empowering businesses to innovate and transform through the use of artificial intelligence and data-driv...

Internet Software & Services

Description

  • Prototype batch and streaming ingestion and connector patterns using NiFi, Kafka, Kafka Connect/Streams, and CDC approaches.
  • Design schemas and data models that are prototype-ready but also easy to adopt and evolve.
  • Build incremental lakehouse datasets using Hudi, Iceberg, or Delta patterns and create queryable outputs for performance evaluation.
  • Incorporate data quality, provenance, metadata, and operability considerations early in the prototype process.
  • Containerize and deploy prototypes on Kubernetes and provide minimal runbooks and configuration for handoff.
  • Produce adoption artifacts including schemas, reference implementations, technical design notes, and an integration backlog.
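The CDC and incremental-lakehouse responsibilities above can be illustrated in miniature. The sketch below is a hedged, dependency-free stand-in (plain Python rather than a real Hudi/Iceberg/Delta table, with a hypothetical `ChangeEvent` record type): it applies a stream of insert/update/delete change events as version-aware upserts into an in-memory keyed "table", which is the merge semantics a lakehouse upsert layer provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """A simplified CDC record: op is 'upsert' or 'delete'."""
    key: str
    op: str
    value: Optional[dict]
    version: int  # monotonically increasing per key (e.g. an LSN or commit timestamp)


def apply_cdc(table: dict, events: list[ChangeEvent]) -> dict:
    """Merge CDC events into a keyed table, skipping stale or replayed versions."""
    # Track the highest version applied per key so redelivered events are no-ops.
    versions: dict[str, int] = {k: v["_version"] for k, v in table.items()}
    for ev in events:
        if ev.version <= versions.get(ev.key, -1):
            continue  # duplicate or out-of-date event: safe to ignore on replay
        versions[ev.key] = ev.version
        if ev.op == "delete":
            table.pop(ev.key, None)
        else:
            table[ev.key] = {**(ev.value or {}), "_version": ev.version}
    return table


events = [
    ChangeEvent("u1", "upsert", {"name": "Ada"}, 1),
    ChangeEvent("u1", "upsert", {"name": "Ada L."}, 2),
    ChangeEvent("u1", "upsert", {"name": "Ada"}, 1),  # replayed duplicate
    ChangeEvent("u2", "upsert", {"name": "Alan"}, 1),
    ChangeEvent("u2", "delete", None, 2),
]
table = apply_cdc({}, events)
# u1 retains only its latest version; u2's delete wins over its insert.
```

A real implementation would delegate the merge to the table format's own upsert (e.g. a MERGE/upsert write), but the version-gated apply shown here is the core idea that makes replayed change streams safe.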

Requirements

  • 3+ years of data engineering experience, including delivery of production pipelines beyond ad hoc scripts.
  • Strong Python and SQL skills for transformations, validation tooling, and pipeline glue code.
  • Practical understanding of streaming and CDC concepts such as ordering, duplication, replay, and idempotency.
  • Experience with the Kafka ecosystem.
  • Familiarity with lakehouse and query/storage layers such as Hudi, Iceberg, Delta, Trino, Hive, or Postgres.
  • Comfort working in Kubernetes and container-based environments.
  • Ability to document technical decisions clearly.
  • Eligible to work in Germany; EU/NATO citizenship is preferred and export-control screening applies.
  • Experience with data quality tools such as Great Expectations is a plus.
  • Experience with metadata or lineage platforms such as OpenMetadata, DataHub, or Atlas is a plus.
  • Experience shipping solutions in on-prem or air-gapped environments is a plus.
  • German language skills at B1+ level and/or experience with OSINT, GEOINT, or multi-INT data shapes are a plus.
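The streaming and CDC concepts listed above (ordering, duplication, replay, idempotency) can be made concrete with a short sketch. The code below is an illustrative, dependency-free stand-in for a Kafka-style consumer, not real Kafka client code: because ordering is guaranteed only within a partition, it tracks the highest offset applied per partition and skips anything at or below it, so each record's side effect happens once even under at-least-once redelivery.

```python
from collections import defaultdict


class IdempotentProcessor:
    """Applies (partition, offset, payload) records safely under at-least-once delivery."""

    def __init__(self):
        self.applied = defaultdict(lambda: -1)  # partition -> highest offset applied
        self.output = []                        # stands in for a downstream sink

    def process(self, partition: int, offset: int, payload: str) -> bool:
        if offset <= self.applied[partition]:
            return False  # duplicate or replayed record: skip the side effect
        self.output.append(payload)       # the "side effect" (e.g. a database write)
        self.applied[partition] = offset  # record progress only after the effect
        return True


proc = IdempotentProcessor()
records = [
    (0, 0, "a"),
    (0, 1, "b"),
    (0, 1, "b"),  # redelivered by the broker
    (1, 0, "c"),  # a different partition has its own offset sequence
    (0, 2, "d"),
]
for part, off, payload in records:
    proc.process(part, off, payload)
# proc.output contains each payload exactly once: ["a", "b", "c", "d"]
```

In production the applied-offset state would live transactionally next to the sink (or the sink itself would be keyed for natural idempotency), but the per-partition high-water mark is the essential mechanism.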

Benefits

  • Remote-first work in Germany with regular Berlin prototyping sprints.
  • 30 days of vacation.
  • Equipment and learning budget.
  • Opportunity to work with a modern stack including Kafka, NiFi, lakehouse technologies, distributed SQL, and Kubernetes.
  • High-leverage work where prototypes become reusable blueprints for multiple teams.
