

Senior Data Engineer – Real-Time ML Infrastructure
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior Data Engineer focused on Real-Time ML Infrastructure, lasting 19 months, paying up to $92.50/hr. Requires expertise in Spark or Flink, real-time streaming, SQL, AWS, and Python/Java/Scala, with 5+ years of relevant experience.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
736
🗓️ - Date discovered
July 2, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
On-site
📄 - Contract type
W2 Contractor
🔒 - Security clearance
Unknown
📍 - Location detailed
Los Angeles, CA
🧠 - Skills detailed
#Spark (Apache Spark) #AWS (Amazon Web Services) #Data Pipeline #Programming #MLflow #Observability #Kafka (Apache Kafka) #Computer Science #Datasets #Data Engineering #ML (Machine Learning) #Java #Python #Metadata #Batch #Data Framework #Data Quality #Monitoring #Data Lake #Cloud #Code Reviews #Scala #SQL (Structured Query Language) #Schema Design
Role description
City: Seattle, WA / Glendale, CA / Burbank, CA / Santa Monica, CA
Onsite / Hybrid / Remote: Onsite (4 days a week onsite)
Duration: 19 months
Rate Range: Up to $92.50/hr on W2, depending on experience (no C2C, 1099, or subcontract)
Work Authorization: GC, USC, all valid EADs except OPT, CPT, H1B
Must Have:
• Spark or Flink or Beam or Kafka Streams
• Real-time streaming pipeline experience (see the illustrative sketch after this list)
• SQL and large-scale schema design
• AWS or equivalent cloud platforms
• Python, Java, or Scala
• ML workflow support (feature engineering, data validation, etc.)
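For concreteness, here is a minimal sketch of the kind of real-time streaming pipeline this role calls for, assuming PySpark Structured Streaming with Kafka; the broker address, topic, event schema, and output paths are hypothetical placeholders, not details from this posting.

```python
# Hypothetical sketch: read user-interaction events from a Kafka topic with
# PySpark Structured Streaming, parse the JSON payload, and append it to a
# data lake path. All names (broker, topic, schema, paths) are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("interaction-stream").getOrCreate()

event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("item_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "user-interactions")           # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("event"))
    .select("event.*")
)

(
    events.writeStream.format("parquet")
    .option("path", "s3://example-bucket/interactions/")        # hypothetical sink
    .option("checkpointLocation", "s3://example-bucket/_chk/")  # hypothetical checkpoint
    .outputMode("append")
    .start()
    .awaitTermination()
)
```

An equivalent pipeline could be built with Flink, Beam, or Kafka Streams; the posting accepts experience with any of these frameworks.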
Responsibilities:
• Design, build, and maintain scalable batch and real-time data pipelines for user interaction data, metadata, and model features
• Develop and operate offline and real-time feature stores to support ML inference and training workflows
• Partner with ML engineers to define data schemas, validation logic, and build production-ready datasets
• Implement monitoring and observability for data quality and pipeline reliability (see the sketch after this list)
• Optimize data workflows for speed, cost, and scalability across large-scale datasets
• Translate personalization requirements into robust data infrastructure solutions
• Participate in selection and adoption of modern data tooling and cloud-native technologies
• Contribute to team’s technical excellence through code reviews and design discussions
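As a hedged illustration of the data-quality monitoring responsibility above, here is a small batch check in PySpark; the dataset path, column name, and 1% threshold are hypothetical and only sketch the general pattern.

```python
# Hypothetical sketch: a simple batch data-quality check of the kind the
# monitoring/observability bullet describes. The input path, checked column,
# and threshold are illustrative placeholders, not details from this posting.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("feature-dq-check").getOrCreate()

features = spark.read.parquet("s3://example-bucket/features/daily/")  # hypothetical input

total_rows = features.count()
null_user_ids = features.filter(col("user_id").isNull()).count()
null_rate = null_user_ids / max(total_rows, 1)

# Failing the job here lets the orchestrator surface the breach as an alert.
if null_rate > 0.01:
    raise ValueError(f"user_id null rate {null_rate:.2%} exceeds the 1% threshold")

print(f"DQ check passed: {total_rows} rows, user_id null rate {null_rate:.2%}")
```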
Qualifications:
• Bachelor’s or Master’s in Computer Science, Data Engineering, or related technical field
• 5+ years of experience building production-grade distributed data systems
• Expertise in modern data frameworks (Spark, Flink, Beam, Kafka Streams)
• Proficient in SQL and schema design for high-scale data environments
• Strong programming skills in Python, Java, or Scala
• Experience with cloud platforms such as AWS and infrastructure components like data lakes, warehouses, and feature stores
• Proven ability to support machine learning data workflows and pipelines
Preferred:
• Experience in building ML infrastructure for personalization or recommendation systems
• Familiarity with MLOps tools (MLflow, TFX, Kubeflow)
• Hands-on knowledge of real-time ML serving and online feature generation
• Prior work in early-stage or 0→1 product development environments
• Strong cross-functional collaboration with ML and product teams