

Brooksource
Lead Data Engineer – AI Data Products
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Lead Data Engineer – AI Data Products on a contract-to-permanent basis. It is 100% remote, and the pay rate is not disclosed. Key skills include Databricks, Spark, PySpark, and CI/CD, along with strong data pipeline experience and technical leadership.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
May 14, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
St Louis, MO
-
🧠 - Skills detailed
#Code Reviews #GitHub #AI (Artificial Intelligence) #Data Warehouse #DataOps #Data Pipeline #Spark (Apache Spark) #Leadership #Scripting #Linux #GIT #PySpark #Automation #Distributed Computing #Python #ETL (Extract, Transform, Load) #SQL (Structured Query Language) #ML (Machine Learning) #Data Analysis #GitLab #Scala #Databricks #Data Processing #Hadoop #DevOps #Data Engineering #Batch
Role description
Lead Data Engineer – AI Data Products
Contract-to-Permanent Hire
100% Remote (8AM-5PM CST)
Our Fortune 50 healthcare client’s AI/ML platforms group is seeking a modern Lead Data Engineer to provide technical leadership and delivery oversight across multiple AI data products within their enterprise AI Hub. This role is primarily focused on technical direction, architectural guidance, and team leadership (~75%), while remaining hands-on (~25%) in building scalable data pipelines, CI/CD automation, and AI-enabling data assets across multiple concurrent initiatives.
Responsibilities:
• Provide technical leadership across multiple AI Data Product initiatives and engineering workstreams.
• Understand and clarify technical requirements, recommend architecture/design elements, and set overall technical direction across projects.
• Design, implement, and maintain scalable ETL/ELT pipelines and distributed data workflows using Databricks/Spark technologies (see the batch ETL sketch after this list).
• Implement and optimize CI/CD pipelines, data operations workflows, and cost management strategies across the data platform.
• Build and support AI-enabling data assets such as vector stores, feature tables, Genie Rooms, and semantic AI context assets, while ensuring integration into model development workflows.
• Partner with AI/ML, analytics, platform, and business teams to deliver production-grade data solutions.
• Support platform visibility by delivering operational insights into platform utilization, cost trends, and financial operations.
• Oversee and support junior-to-senior engineers via proofs of concept (POCs), technical guidance, troubleshooting, and code reviews.
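As a rough illustration of the batch pipeline work described above, here is a minimal PySpark sketch that reads raw Parquet, applies light cleanup, and writes a partitioned Delta table. The paths, table name (ai_hub.claims_cleaned), and column names are hypothetical, not from the posting, and the Delta write assumes a Databricks runtime (or the delta-spark package).
```python
from pyspark.sql import SparkSession, functions as F

# Minimal batch ETL sketch: raw Parquet in, cleaned Delta table out.
# All paths, table names, and columns here are illustrative only.
spark = SparkSession.builder.appName("batch-etl-sketch").getOrCreate()

raw = spark.read.parquet("/mnt/raw/claims/")  # hypothetical landing zone

cleaned = (
    raw
    .dropDuplicates(["claim_id"])                 # de-duplicate on a business key
    .filter(F.col("claim_amount").isNotNull())    # basic data-quality gate
    .withColumn("ingest_date", F.current_date())  # audit/partition column
)

(
    cleaned.write
    .format("delta")                              # assumes Databricks / delta-spark
    .mode("overwrite")
    .partitionBy("ingest_date")
    .saveAsTable("ai_hub.claims_cleaned")         # hypothetical catalog table
)
```
Partitioning on an ingest-date column keeps overwrites and downstream incremental reads cheap, which is one common way roles like this manage Databricks cost.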
Requirements:
• Strong hands-on experience with Databricks Data Engineering and Spark distributed computing. Hadoop ecosystem experience is a plus.
• PySpark and Python expertise for large-scale data processing.
• Strong SQL skills and experience with data warehouses and data analysis.
• Hands-on experience building data pipelines, both batch and streaming (see the streaming sketch after this list).
• Experience working with columnar data formats (Parquet, Delta).
• Experience with DevOps practices, CI/CD pipeline development, and Git workflows (GitHub/GitLab).
• Familiarity with Linux scripting fundamentals (for pipeline and CI/CD automation).
• Exposure to emerging AI data infrastructure, such as building vector stores and applying DataOps / MLOps practices.
• Technical leadership across multiple concurrent projects, providing architectural guidance, defining technical work, and setting technical direction.
• US Citizens & Green Card holders only
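As a hedged sketch of the batch-and-streaming experience called out above, the following PySpark Structured Streaming job incrementally lands JSON events into a Delta table. The source path, checkpoint locations, and target table are hypothetical, and the cloudFiles source assumes Databricks Auto Loader; on plain open-source Spark you could substitute a JSON file source with an explicit schema.
```python
from pyspark.sql import SparkSession, functions as F

# Minimal streaming sketch: incrementally land JSON events into Delta.
# Paths, checkpoints, and the target table are illustrative only.
spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

events = (
    spark.readStream
    .format("cloudFiles")                                   # Databricks Auto Loader (assumed)
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/chk/schema") # tracks inferred schema
    .load("/mnt/raw/events/")                               # hypothetical source path
)

query = (
    events
    .withColumn("processed_at", F.current_timestamp())
    .writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/chk/events")        # exactly-once bookkeeping
    .trigger(availableNow=True)                             # drain available data, then stop
    .toTable("ai_hub.events_bronze")                        # hypothetical bronze table
)
query.awaitTermination()
```
trigger(availableNow=True) processes everything currently available and then stops, so the same job can be run on a batch schedule while keeping streaming checkpoint semantics.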






