Motion Recruitment

Data Engineer - Python, AI/ML

This role is for a Data Engineer. The contract length is not specified, and no hourly pay rate is listed. Key skills include Python, SQL, Apache Spark, and cloud platforms (AWS, Azure, GCP). It requires 2+ years of data pipeline experience and a relevant degree.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
480
🗓️ - Date
May 21, 2026
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Troy, MI
🧠 - Skills detailed
#Compliance #Libraries #Apache Spark #MySQL #Cloud #Computer Science #BI (Business Intelligence) #ETL (Extract, Transform, Load) #Classification #Tableau #AWS (Amazon Web Services) #Schema Design #Data Science #GCP (Google Cloud Platform) #SciPy #Data Quality #Data Pipeline #AI (Artificial Intelligence) #Pandas #PySpark #Microsoft Power BI #Datasets #Spark (Apache Spark) #Python #Statistics #Automated Testing #Looker #Azure #Data Engineering #SQL (Structured Query Language) #Data Lineage #PostgreSQL #Apache Airflow #Databases #ML (Machine Learning) #Anomaly Detection #SQL Server #Git #Metadata #Airflow #NumPy
Role description
Key Responsibilities
• Build and maintain Python and SQL pipelines for governance-related ingestion, cleaning, transformation, and validation of structured and semi-structured data.
• Implement and operate data quality checks, schema validation, and integrity rules across pipelines; investigate and resolve quality issues.
• Contribute to master data workflows: standardization, deduplication, and consolidation of data from heterogeneous sources into consistent reference and golden-record datasets.
• Instrument pipelines for data lineage, metadata, and catalog tooling.
• Develop pipelines that feed governance dashboards and reporting in Tableau, Power BI, or Looker.
• Build reproducible, well-documented pipelines for compliance and audit reporting.
• Contribute to AI / ML-assisted governance use cases: embedding-based data classification, anomaly detection on quality metrics, LLM-assisted catalog search, and MCP-based exposure of governed datasets to AI assistants.
• Partner with team leads, data stewards, and stakeholders to translate governance requirements into engineering work.
• Follow team engineering practices: Git, code review, modular pipeline design, automated testing, CI/CD.

Required Qualifications
• Bachelor's or Master's degree in Computer Science, Data Science, Engineering, Statistics, or a related field.
• 2+ years building data pipelines in Python (Pandas, NumPy, SciPy) and SQL.
• Working experience with Apache Spark or PySpark and workflow orchestration (Apache Airflow).
• Schema design across relational (PostgreSQL, MySQL, SQL Server) and analytical databases, including standardization across heterogeneous sources.
• Experience implementing data quality validation, EDA, and integrity enforcement on production datasets.
• Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP).
• Working familiarity with Python ML libraries (Scikit-Learn) for feature engineering and exploratory analysis.
• Experience producing analytics-ready datasets for BI tools (Tableau, Power BI, or Looker).
• Git, code review, and CI/CD practices.
• Clear technical communication and collaborative working style.