

TalentBurst, an Inc 5000 company
Machine Learning Engineer
This role is for a Machine Learning Data Engineer on a 12+ month engagement based in Seattle, WA (3 days onsite). It requires a Bachelor's degree in Computer Science or a related field, 3+ years of experience, proficiency in Python, and knowledge of data processing tools and cloud systems.
Country: United States
Currency: $ USD
Day rate: 624
Date: December 25, 2025
Duration: More than 6 months
Location: Hybrid
Contract: Unknown
Security: Unknown
Location detailed: Seattle, WA
Skills detailed:
#PyTorch #Computer Science #Data Ingestion #Documentation #S3 (Amazon Simple Storage Service) #NLP (Natural Language Processing) #AWS S3 (Amazon Simple Storage Service) #Data Warehouse #SQL (Structured Query Language) #Data Engineering #Pandas #Datasets #AWS (Amazon Web Services) #ML (Machine Learning) #Cloud #Data Science #Scripting #Data Quality #Data Pipeline #ETL (Extract, Transform, Load) #Shell Scripting #Scala #Data Processing #Monitoring #Python
Role description
Job Title: Machine Learning Data Engineer
Duration: 12+ months
Location: Seattle, WA, 3 days onsite (Tuesday, Wednesday, Thursday)
----------------------------------------------------
Job Overview:
We are seeking a detail-oriented Machine Learning Data Engineer to join our team. As an ML Data Engineer, you will be responsible for designing, building, and maintaining scalable data pipelines that ingest, transform, and load data from various sources into our cloud-based systems. You will work closely with machine learning teams to ensure that data is accurate, enriched, reliable, and readily available for analytics and model training.
Responsibilities:
• Design and Build Data Pipelines: Create efficient, reliable, streamable, and scalable data pipelines using industry-standard tools and techniques, such as TorchData, WebDataset, Apache Parquet, Python, and SQL.
• Data Ingestion: Develop strategies for ingesting data from data providers, ensuring data quality and consistency.
• Data Pre-processing: Implement parallel pre-processing to clean, transform, de-duplicate, combine, and normalize data.
• Data Curation and Enrichment: Curate, augment, and enrich existing datasets to improve data quality and provide valuable insights to stakeholders.
• Synthetic Data Generation: Collaborate with synthetic data teams to generate data and incorporate it into existing pipelines.
• Collaboration with ML Teams: Work closely with ML scientists, engineers, and product teams to understand data requirements and collaborate on data delivery.
• Monitoring, Maintenance & Updating: Monitor data pipelines for performance, errors, and bottlenecks, and implement regular maintenance and updates. Stay current with the latest trends and incorporate best practices into data pipelines.
• Technical Documentation: Document data pipelines, settings, and procedures for easy maintenance and knowledge sharing.
Minimum Qualifications:
• Bachelor's degree in Computer Science, Information Technology, or a related field.
• At least 3 years of experience as a Software Engineer or Data Engineer.
• Strong software engineering skills and proficiency in Python.
• Experience with data processing tools and formats such as Apache Parquet, WebDataset, TorchData, Pandas, shell scripting, Protobuf, and TFRecord.
• Knowledge of data warehouse architectures and cloud-based systems (e.g., AWS S3).
• Strong problem-solving and analytical skills.
• Excellent communication and collaboration skills.
Preferred Qualifications:
• Master's degree in Data Science or a related field.
• Experience with data curation and enrichment techniques, particularly for large-scale text, image, and video data.
• Familiarity with natural language processing (NLP) and machine learning (ML) concepts and frameworks (PyTorch).
Location:
• Seattle, WA (in-person 3 days/wk, remote 2 days/wk)






