Data Engineer (Python)

2 months ago
Full-time
Mid Level
Software Development
Orcrist Technologies

Orcrist Technologies specializes in providing advanced technology solutions, including data analytics, AI applications, and cybersecurity, aimed at empowering businesses to innovate and transform through the use of artificial intelligence and data-driv...

Internet Software & Services

Description

  • Prototype batch and streaming ingestion and connector patterns using NiFi, Kafka, Kafka Connect, Kafka Streams, and change data capture (CDC) approaches.
  • Design schemas and data models that are easy to adopt and support clear semantics and controlled evolution.
  • Build incremental lakehouse datasets using Hudi, Iceberg, or Delta patterns and create queryable outputs for performance and latency evaluation.
  • Incorporate data quality, provenance, metadata, and operability considerations early in prototype development.
  • Containerize and deploy prototypes on Kubernetes and provide minimal runbooks and configuration files to support handoff.
  • Produce adoption artifacts including schemas, reference implementations, technical design notes, and integration backlogs.
  • Generate credible performance and operability readouts for new data initiatives.
  • Collaborate with delivery or foundation teams to enable productization of prototypes.

Requirements

  • 3+ years of data engineering experience, with hands-on pipeline delivery beyond ad hoc scripts.
  • Strong Python and SQL skills for transformations, validation tooling, and pipeline glue code.
  • Practical knowledge of streaming and CDC concepts, including ordering, duplication, replay, and idempotency.
  • Experience with the Kafka ecosystem.
  • Familiarity with lakehouse and storage/query layers such as Hudi, Iceberg, Delta, Trino, Hive, or Postgres.
  • Comfort working in Kubernetes and container-based environments.
  • Ability to document technical decisions clearly.
  • Eligible to work in Germany; EU or NATO citizenship is preferred and export-control screening applies.
  • Experience with Great Expectations or similar data quality tools is preferred.
  • Experience with metadata and lineage platforms such as OpenMetadata, DataHub, or Atlas is preferred.
  • Experience shipping in on-prem or air-gapped environments is preferred.
  • Governance and policy awareness for regulated customers is preferred.
  • German language proficiency at B1+ level is preferred.
  • Experience with OSINT, GEOINT, or multi-INT data shapes is preferred.

Benefits

  • Remote-first work in Germany with regular Berlin prototyping sprints.
  • 30 days of vacation.
  • Equipment and learning budget.
  • Opportunity to work with a modern data stack including Kafka, NiFi, lakehouse technologies, distributed SQL, and Kubernetes.
  • High-leverage work where prototypes become blueprints reused and productized by multiple teams.

Apply directly on the company website.
