

Senior Data Engineer – Real-Time ML Infrastructure
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior Data Engineer focused on Real-Time ML Infrastructure, lasting 19 months, paying up to $92.50/hr. Requires expertise in Spark or Flink, real-time streaming, SQL, AWS, and Python/Java/Scala, with 5+ years of relevant experience.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
736
🗓️ - Date discovered
July 2, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
On-site
📄 - Contract type
W2 Contractor
🔒 - Security clearance
Unknown
📍 - Location detailed
Los Angeles, CA
🧠 - Skills detailed
#Spark (Apache Spark) #AWS (Amazon Web Services) #Data Pipeline #Programming #MLflow #Observability #Kafka (Apache Kafka) #Computer Science #Datasets #Data Engineering #ML (Machine Learning) #Java #Python #Metadata #Batch #Data Framework #Data Quality #Monitoring #Data Lake #Cloud #Code Reviews #Scala #SQL (Structured Query Language) #Schema Design
Role description
City: Seattle, WA / Glendale, CA / Burbank, CA / Santa Monica, CA
Onsite / Hybrid / Remote: Onsite (4 days a week onsite)
Duration: 19 months
Rate Range: Up to $92.50/hr on W2, depending on experience (no C2C, 1099, or subcontract)
Work Authorization: GC, USC, all valid EADs except OPT, CPT, H1B
Must Have:
• Spark or Flink or Beam or Kafka Streams
• Real-time streaming pipeline experience (see the illustrative sketch after this list)
• SQL and large-scale schema design
• AWS or equivalent cloud platforms
• Python, Java, or Scala
• ML workflow support (feature engineering, data validation, etc.)
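For concreteness, here is a minimal sketch of the kind of real-time streaming pipeline this role calls for, assuming PySpark Structured Streaming with Kafka; the broker address, topic, event schema, and output paths are hypothetical placeholders, not details from this posting.

```python
# Hypothetical sketch: read user-interaction events from a Kafka topic with
# PySpark Structured Streaming, parse the JSON payload, and append it to a
# data lake path. All names (broker, topic, schema, paths) are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("interaction-stream").getOrCreate()

event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("item_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "user-interactions")           # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("event"))
    .select("event.*")
)

(
    events.writeStream.format("parquet")
    .option("path", "s3://example-bucket/interactions/")        # hypothetical sink
    .option("checkpointLocation", "s3://example-bucket/_chk/")  # hypothetical checkpoint
    .outputMode("append")
    .start()
    .awaitTermination()
)
```

An equivalent pipeline could be built with Flink, Beam, or Kafka Streams; the posting accepts experience with any of these frameworks.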
Responsibilities:
• Design, build, and maintain scalable batch and real-time data pipelines for user interaction data, metadata, and model features
• Develop and operate offline and real-time feature stores to support ML inference and training workflows
• Partner with ML engineers to define data schemas, validation logic, and build production-ready datasets
• Implement monitoring and observability for data quality and pipeline reliability (see the sketch after this list)
• Optimize data workflows for speed, cost, and scalability across large-scale datasets
• Translate personalization requirements into robust data infrastructure solutions
• Participate in selection and adoption of modern data tooling and cloud-native technologies
• Contribute to team’s technical excellence through code reviews and design discussions
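As a hedged illustration of the data-quality monitoring responsibility above, here is a small batch check in PySpark; the dataset path, column name, and 1% threshold are hypothetical and only sketch the general pattern.

```python
# Hypothetical sketch: a simple batch data-quality check of the kind the
# monitoring/observability bullet describes. The input path, checked column,
# and threshold are illustrative placeholders, not details from this posting.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("feature-dq-check").getOrCreate()

features = spark.read.parquet("s3://example-bucket/features/daily/")  # hypothetical input

total_rows = features.count()
null_user_ids = features.filter(col("user_id").isNull()).count()
null_rate = null_user_ids / max(total_rows, 1)

# Failing the job here lets the orchestrator surface the breach as an alert.
if null_rate > 0.01:
    raise ValueError(f"user_id null rate {null_rate:.2%} exceeds the 1% threshold")

print(f"DQ check passed: {total_rows} rows, user_id null rate {null_rate:.2%}")
```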
Qualifications:
• Bachelor’s or Master’s in Computer Science, Data Engineering, or related technical field
• 5+ years of experience building production-grade distributed data systems
• Expertise in modern data frameworks (Spark, Flink, Beam, Kafka Streams)
• Proficient in SQL and schema design for high-scale data environments
• Strong programming skills in Python, Java, or Scala
• Experience with cloud platforms such as AWS and infrastructure components like data lakes, warehouses, and feature stores
• Proven ability to support machine learning data workflows and pipelines
Preferred:
• Experience in building ML infrastructure for personalization or recommendation systems
• Familiarity with MLOps tools (MLflow, TFX, Kubeflow)
• Hands-on knowledge of real-time ML serving and online feature generation
• Prior work in early-stage or 0→1 product development environments
• Strong cross-functional collaboration with ML and product teams