

ML Data Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for an ML Data Engineer on a 7-month contract in Cupertino, CA, offering $45.00 - $60.00 per hour. Key skills include audio signal processing, Python, cloud infrastructure, and experience with large-scale data pipelines and machine learning workflows.
Country: United States
Currency: $ USD
Day rate: not specified
Date discovered: July 12, 2025
Project duration: More than 6 months
Location type: On-site
Contract type: Unknown
Security clearance: Unknown
Location detailed: Cupertino, CA
Skills detailed: #Databases #Data Storage #Data Quality #ML (Machine Learning) #Classification #Spark (Apache Spark) #Python #MLflow #Computer Science #Visualization #Signal Processing #Batch #Scala #Apache Beam #PyTorch #SciPy #Datasets #Data Management #GCP (Google Cloud Platform) #Data Engineering #Data Pipeline #Airflow #Cloud #Deployment #AWS (Amazon Web Services) #Deep Learning #Azure #ETL (Extract, Transform, Load) #TensorFlow #Storage #Libraries
Role description
A globally leading technology company is looking for an ML Data Engineer to design and optimize data pipelines for large-scale audio and acoustic machine learning workflows. In this role, you'll collaborate with researchers and signal processing experts to deliver high-quality, scalable datasets that power state-of-the-art models. If you're passionate about machine learning infrastructure, audio data, and real-world impact, we'd love to hear from you.
Key Responsibilities:
• Design, build, and maintain scalable and efficient data pipelines for processing large-scale audio and acoustic datasets.
• Collaborate with ML researchers and acoustic scientists to collect, annotate, transform, and curate high-quality training and evaluation datasets.
• Implement signal processing algorithms for feature extraction.
• Work on real-time and batch processing frameworks for streaming and static audio data.
• Support model training and evaluation through optimized data loaders and preprocessing steps.
• Ensure data quality, versioning, and reproducibility using best practices in data engineering.
• Deploy and maintain cloud-based infrastructure for data workflows (e.g., AWS, GCP, Azure).
• Develop tools for data visualization and annotation specific to acoustic events.
Required Qualifications:
• Bachelor's or Master's degree in Computer Science, Electrical Engineering, Acoustics, or a related field.
• Strong experience with audio signal processing libraries (e.g., Librosa, PyDub, SciPy, torchaudio).
• Proficient in Python and relevant data engineering frameworks (e.g., Airflow, Apache Beam, Spark).
• Experience working with large-scale data pipelines and cloud infrastructure.
• Familiarity with machine learning workflows, especially in audio or time-series domains.
• Understanding of acoustic features and formats (e.g., WAV, FLAC, sampling rates).
• Strong knowledge of databases, data storage formats (e.g., Parquet, HDF5), and data management tools.
• Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) for audio modeling.
• Knowledge of acoustic modeling, speech recognition, or sound classification.
• Experience with edge deployment and real-time audio processing.
• Familiarity with tools like Weights & Biases, MLflow, or DVC for ML operations.
Type: Contract
Duration: 7 months (with a possibility to extend to 18 months)
Work Location: Cupertino, CA (100% on-site)
Pay Rate: $45.00 - $60.00 per hour (DOE)
No C2C or third-party agencies.