RevStar

RevStar Consulting is a client-centric cloud consulting firm that specializes in developing modern, user-focused, cloud-native web and mobile applications. They offer custom integrations, implementations, and solutions using the latest cloud technologi...

Industry: Internet Software & Services
Company size: 51-250 employees
Founded: 2009

Description

  • Develop and optimize data pipelines using Apache Spark and Delta Lake within Databricks.
  • Implement ETL/ELT workflows for data ingestion, transformation, and storage.
  • Design scalable Lakehouse architecture solutions across structured and unstructured data sources.
  • Integrate Databricks with cloud storage platforms such as Azure Data Lake, AWS S3, and Google Cloud Storage.
  • Optimize Spark jobs for scalability, cost efficiency, and low latency.
  • Implement monitoring, alerting, automated validation, and data quality processes.
  • Support ML model training and deployment within Databricks using MLflow for tracking and versioning.
  • Collaborate with data scientists, ML engineers, architects, and business stakeholders to deliver aligned solutions.
  • Implement feature engineering pipelines and help integrate models into production environments.
  • Ensure data security, access control, governance, compliance, and documentation best practices.

Requirements

  • 3+ years of hands-on experience in data engineering with big data processing and cloud-native architectures.
  • 2+ years of hands-on experience with Databricks, including Apache Spark, Delta Lake, and MLflow.
  • A Databricks Certified Data Engineer Associate certification (or higher) is mandatory.
  • Proficiency in Python, SQL, and Spark-based frameworks.
  • Experience developing and optimizing large-scale ETL/ELT pipelines.
  • Strong understanding of Lakehouse architecture and cloud-agnostic data solutions.
  • Familiarity with CI/CD pipelines and Infrastructure-as-Code tooling for Databricks, such as Terraform and the Databricks CLI.
  • Knowledge of data governance, security, and compliance best practices.
  • Experience working in Agile environments and following DevOps/MLOps best practices.
  • Preferred: additional Databricks certifications, such as Databricks Certified Machine Learning Associate.
  • Preferred: experience with real-time streaming tools such as Kafka, Kinesis, or Azure Event Hubs.
  • Preferred: familiarity with orchestration tools such as Apache Airflow or Prefect.
  • Preferred: background in AI/ML integration within Databricks, including feature engineering and model deployment.
  • Preferred: experience in client-facing roles or consulting environments.

Benefits

  • Paid time off.
  • Remote-first working environment.
  • Comprehensive health coverage including medical, dental, and vision.
  • 401(k) retirement plan.
  • Annual learning and development stipend for conferences, certifications, or courses.
  • Peer mentorship and coaching.
  • Professional growth opportunities with exposure to AWS GenAI, data, and cloud technologies.
  • Company outings and volunteer opportunities.
  • Collaborative, innovative culture.
