Principal Data Architect – Streaming & Data Platforms

1 month ago
Full-time
Lead
DevOps and Infrastructure
Egen.ai

Egen.ai specializes in providing technology services that leverage cloud computing, data analytics, and artificial intelligence to enhance document intelligence and drive productivity and growth for its clients.

IT Services
Founded 2000

Description

  • Design and implement scalable streaming data platforms to support real-time ingestion, processing, and analytics.
  • Architect and guide development of end-to-end data platforms across batch and streaming workloads.
  • Lead and contribute to Master Data Management (MDM) solutions, including golden record design, data matching, survivorship, and hierarchy management.
  • Define and implement data governance frameworks covering data ownership/stewardship, data quality rules and monitoring, metadata, lineage, and access controls.
  • Collaborate with application teams to expose data via APIs and event-driven architectures.
  • Provide architectural guidance for cloud-native deployments, including containerization and orchestration.
  • Establish data architecture standards, patterns, and best practices, and ensure engineering teams align on them.
  • Partner with DevOps teams to enable CI/CD, infrastructure automation, and platform reliability.
  • Review designs, mentor engineers, and drive technical decisions across projects.

Requirements

  • 8–12+ years in data engineering and architecture roles.
  • Strong experience designing streaming data platforms (Kafka, MSK, Kinesis, or similar).
  • Proven hands-on experience building or leading MDM implementations.
  • Solid understanding of data governance principles and real-world implementation.
  • Experience designing event-driven and API-first data architectures.
  • Deep knowledge of data modeling, including canonical models and domain-driven design concepts.
  • Ability to translate business requirements into scalable technical solutions.
  • Good communication skills.
  • Preferred: AWS experience (MSK, Kinesis, S3, Glue, Lambda, ECS, etc.).
  • Preferred: Experience with Java / Spring Boot for data and API services, containerization/orchestration using Docker and ECS, exposure to DevOps practices (CI/CD, IaC, Terraform, Git-based workflows), and cloud security/IAM best practices.

Benefits

  • $200,000–$220,000 annual salary range.
  • Comprehensive health insurance.
  • Paid leave (vacation/PTO) and paid holidays.
  • Parental leave and bereavement leave.
  • 401(k) employer match.
  • Employee referral bonuses.
  • Remote, full-time role with a competitive benefits package.
