Vertisystem

ML Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for an ML Data Engineer with a 12-month contract, paying $70-$90/hr on W2. Required skills include Python, data engineering, and machine learning experience. Must have 5 years of experience and a relevant degree. On-site work is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
720
πŸ—“οΈ - Date
October 1, 2025
🕒 - Duration
More than 6 months
🏝️ - Location
On-site
📄 - Contract
W2 Contractor
🔒 - Security
Unknown
📍 - Location detailed
Redmond, WA
🧠 - Skills detailed
#Quality Assurance #Databases #Linux #Data Quality #NumPy #SQL (Structured Query Language) #Computer Science #Python #Data Science #REST (Representational State Transfer) #Libraries #Scala #Security #Anomaly Detection #PyTorch #Compliance #Data Cleaning #Datasets #Data Engineering #ML (Machine Learning) #Programming #ETL (Extract, Transform, Load) #Data Processing #Shell Scripting #Scripting #Pandas #Data Governance #SciPy #NoSQL #REST API
Role description
Duration: 12-month contract
Pay rate: $70-$90/hr on W2

Job Summary:
The organization brings together a world-class team of researchers, developers, and engineers to create the future of virtual and augmented reality, which together will become as universal and essential as smartphones and personal computers are today. In particular, the organization’s Research Audio team focuses on two goals: creating virtual sounds that are perceptually indistinguishable from reality and redefining human hearing. These two initiatives will allow us to connect people by letting them feel together despite being physically apart and letting them converse in even the most difficult listening environments.

We are looking for a contract ML Data Engineer to work at the intersection of data engineering and applied machine learning. You will help process and transform our complex multimedia data into complete machine learning datasets suitable for consumption by researchers. We require someone who can take a hands-on approach to building data-processing pipelines, ensuring the data is robust and well prepared for our machine learning workflows.

Responsibilities:
• Design, develop, and maintain scalable data-processing pipelines for large volumes of multimedia (audio, video) and sensor data (e.g. IMU), ensuring reliability and reproducibility.
• Gather and interpret processing requirements from stakeholders, translating them into practical technical solutions and devising novel approaches where needed.
• Perform diverse data-processing operations, from mathematical transformations and filtering to feature extraction, synchronization, and inference through ML models.
• Interface with various internal tooling at the organization, such as dataset management systems and training frameworks, to prepare raw data for machine learning, including validation, transformation, and quality assurance.
• Collaborate with machine learning researchers to integrate research prototypes into production pipelines.
• Ensure compliance with data governance, security, and relevant standards.

Minimum requirements
• Bachelor’s degree in a relevant technical field (e.g. Computer Science, Data Science) with 3+ years of industry experience in machine learning or data engineering; or equivalent combination of education and experience.
• Demonstrable programming experience in Python using common ML and data libraries, e.g. NumPy, SciPy, pandas.
• Proficiency in Linux and shell scripting.
• Working knowledge of audio, image, and video formats.

Preferred experience
• Experience using PyTorch or other Python machine-learning frameworks.
• Experience with relational and graph / NoSQL databases.
• Experience using REST APIs for data interactions.
• Experience working in a research environment.
• Strong mathematical background.

Top 3 must-have HARD skills:
• Strong knowledge of Python in the context of data engineering and data processing (SQL, data cleaning, anomaly detection)
• Working understanding of ML training, specifically how data quality impacts ML training outcomes from PyTorch workflows

Good to have skills:
• Knowledge of multimodal data sets (Audio/Video, OptiTrack, multi-Camera/Sensor)
• Beginner knowledge of audio (DSP, acoustics)

Years of experience needed:
• 5

Story Behind the Need – Business Group & Key Projects:
• This person will work directly between Audio ML Researchers and Data Collection partners to ensure we have high-quality datasets with which to train the next generation of ML models and algorithms for future devices

Is there anything we can share with Candidates to compel them to choose the organization over competitors?
• Working with highly capable PhDs who are experts in the field of Audio and Machine Learning
• Highly interesting, cutting-edge work
• On-site Cafe - free food!

Typical Day in the Role:
• Meet with researchers/research assistants to understand issues/solutions
• Analyze and process data
• In-Office preferred, for hands-on collaboration

Resume Disqualifiers?
• We need someone who understands data and is willing to be hands-on. This is not a “software engineer” role - candidate NEEDS to have real-world data processing experience

How will performance be measured:
• We value people who can operate independently to drive results

Interview Process/Format:
• Typically, 1 brief, 30 min screen followed by a 1.5-2 hr deeper dive into technical and behavioral specifics