RevStar

RevStar Consulting is a client-centric cloud consulting firm that specializes in developing modern, user-focused, cloud-native web and mobile applications. They offer custom integrations, implementations, and solutions using the latest cloud technologi...

Internet Software & Services
51-250
Founded 2009

Description

  • Develop and optimize data pipelines using Apache Spark and Delta Lake within Databricks.
  • Implement ETL/ELT workflows for data ingestion, transformation, and storage.
  • Design scalable Lakehouse architecture solutions across structured and unstructured data sources.
  • Integrate Databricks with cloud storage platforms such as Azure Data Lake, AWS S3, and Google Cloud Storage.
  • Optimize Spark jobs for scalability, cost efficiency, and low latency.
  • Implement monitoring, alerting, automated validation, and data quality processes.
  • Support ML model training and deployment within Databricks using MLflow for tracking and versioning.
  • Collaborate with data scientists, ML engineers, architects, and business stakeholders to deliver aligned solutions.
  • Implement feature engineering pipelines and help integrate models into production environments.
  • Ensure data security, access control, governance, compliance, and documentation best practices.

Requirements

  • 3+ years of hands-on experience in data engineering with big data processing and cloud-native architectures.
  • 2+ years of hands-on experience with Databricks, including Apache Spark, Delta Lake, and MLflow.
  • Databricks Certified Data Engineer Associate or higher certification is mandatory.
  • Proficiency in Python, SQL, and Spark-based frameworks.
  • Experience developing and optimizing large-scale ETL/ELT pipelines.
  • Strong understanding of Lakehouse architecture and cloud-agnostic data solutions.
  • Familiarity with CI/CD pipelines and Infrastructure-as-Code tools for Databricks, such as Terraform and the Databricks CLI.
  • Knowledge of data governance, security, and compliance best practices.
  • Experience working in Agile environments and following DevOps/MLOps best practices.
  • Preferred: additional Databricks certifications, such as Databricks Certified Machine Learning Associate.
  • Preferred: experience with real-time streaming tools such as Kafka, Kinesis, or Azure Event Hubs.
  • Preferred: familiarity with orchestration tools such as Apache Airflow or Prefect.
  • Preferred: background in AI/ML integration within Databricks, including feature engineering and model deployment.
  • Preferred: experience in client-facing roles or consulting environments.

Benefits

  • Paid time off.
  • Remote-first working environment.
  • Comprehensive health coverage including medical, dental, and vision.
  • 401(k) retirement plan.
  • Annual learning and development stipend for conferences, certifications, or courses.
  • Peer mentorship and coaching.
  • Professional growth opportunities with exposure to AWS GenAI, data, and cloud technologies.
  • Company outings and volunteer opportunities.
  • Collaborative, innovative culture.
