3Pillar Global

3Pillar Global is an innovative product development company that builds breakthrough software products to power digital businesses. They offer a range of services including product strategy, management, user experience design, and software engineering ...

Internet Software & Services
1K-5K
Founded 2006
$26M raised

Description

  • Build, test, and maintain batch and real-time data pipelines using Snowflake, PySpark, Delta Lake, and Kafka.
  • Implement data quality checks, schema validation, and alerting across pipeline stages.
  • Migrate legacy ETL and data warehouse systems to cloud-native AWS or Azure architectures.
  • Maintain CI/CD pipelines, including automated testing, deployment, rollback, and infrastructure as code using Terraform and GitHub Actions.
  • Build end-to-end retrieval infrastructure, including document ingestion, embedding pipelines, vector store management, and hybrid retrieval layers.
  • Implement chunking, metadata filtering, re-ranking, and retrieval tuning for precision, recall, and latency.
  • Maintain business entity mappings, ontologies, knowledge graphs, feature stores, and semantic data contracts.
  • Build ML data infrastructure for training curation, feature engineering, experiment tracking, and dataset versioning.
  • Support LLM fine-tuning workflows, including corpus curation, quality filtering, dataset formatting, and automated evaluation pipelines.
  • Implement agent-facing data APIs, tool schemas, memory or state stores, observability, and semantic query interfaces.

Requirements

  • 7+ years of experience in data engineering using cloud services.
  • 2+ years of experience with production AI/ML or LLM-era data infrastructure.
  • Proven experience building large-scale batch and streaming pipelines on Snowflake and AWS or Azure.
  • Deep expertise in Python, SQL, PySpark, Snowflake, Delta Lake, Kafka, and Spark Structured Streaming.
  • Hands-on experience with vector stores, embedding pipelines, and retrieval infrastructure in production RAG environments.
  • Working knowledge of MLOps tools and practices, including MLflow, CI/CD for AI, automated evaluation, and production monitoring.
  • Strong grounding in data governance, quality frameworks, and compliance-aligned engineering.
  • Experience with primary tools such as Databricks, AWS services (S3, Glue, Kinesis, EKS, and Redshift), Docker, Kubernetes, and GitHub Actions.
  • Secondary experience with LangChain, LlamaIndex, LLM APIs, Pinecone, FAISS, ChromaDB, OpenSearch, FastAPI, Neo4j, LangGraph, prompt engineering, RLHF dataset preparation, and LLM fine-tuning workflows.
