Lead Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Lead Data Engineer with a contract length of "X months," offering a pay rate of "$X per hour." Key skills include Azure, Python, PySpark, and Apache Airflow. Experience in ETL/ELT pipelines and MLOps is required.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
April 22, 2025
🕒 - Project duration
Unknown
🏝️ - Location type
Unknown
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Warren, NJ
🧠 - Skills detailed
#Azure DevOps #Apache Airflow #Airflow #Big Data #Terraform #Storage #Version Control #Databricks #Azure Databricks #Kubernetes #Cloud #PySpark #SQL (Structured Query Language) #Azure #ADF (Azure Data Factory) #Docker #Programming #ML (Machine Learning) #Spark (Apache Spark) #Data Science #Data Lake #GIT #Deployment #Microsoft Azure #Synapse #Azure Blob Storage #Azure Data Factory #DevOps #MLflow #ETL (Extract, Transform, Load) #GitHub #Python #Data Pipeline #Scala #Data Quality #Data Engineering
Role description

Key Responsibilities:

   • Design, develop, and maintain scalable and reliable ETL/ELT pipelines on Azure.

   • Work closely with data scientists, analysts, and other stakeholders to deliver end-to-end data solutions.

   • Build and optimize big data pipelines using PySpark and Azure Databricks.

   • Orchestrate data workflows using Apache Airflow or Azure Data Factory.

   • Implement and support MLOps frameworks and deployment pipelines.

   • Monitor, troubleshoot, and ensure data quality and reliability across systems.

   • Drive architectural decisions for data platform components, ensuring best practices are followed.

   • Mentor junior data engineers and contribute to knowledge sharing within the team.
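The pipeline responsibilities above follow the classic extract/transform/load pattern with a data-quality gate. As a hedged illustration only (the production stack described here uses PySpark, Azure Databricks, and Airflow, none of which are assumed below), a minimal plain-Python sketch of that pattern might look like:

```python
# Illustrative ETL sketch in plain Python. This is a pattern sketch, not the
# role's actual implementation; the real pipelines would run on
# PySpark/Databricks and be orchestrated by Airflow or Azure Data Factory.

def extract(source_rows):
    """Extract: read raw records from a source (here, an in-memory list)."""
    return list(source_rows)

def transform(rows):
    """Transform: normalize fields and drop records failing a quality check."""
    cleaned = []
    for row in rows:
        if row.get("id") is None:  # data-quality gate: require a primary key
            continue
        cleaned.append({
            "id": row["id"],
            "name": row.get("name", "").strip().lower(),
        })
    return cleaned

def load(rows, target):
    """Load: write transformed records into a target store (here, a dict)."""
    for row in rows:
        target[row["id"]] = row
    return target

# Usage: run the three stages end to end.
raw = [
    {"id": 1, "name": "  Alice "},
    {"id": None, "name": "missing key"},
    {"id": 2, "name": "Bob"},
]
warehouse = load(transform(extract(raw)), {})
```

In an orchestrated setting, each stage would typically become its own task (an Airflow operator or a Data Factory activity) so failures can be retried per stage rather than per pipeline.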

Key Skills and Technologies:

   • Cloud Platforms: Microsoft Azure (Azure Data Lake, Azure Databricks, Azure Synapse, Azure Blob Storage)

   • Programming Languages: Python (strong), SQL

   • Big Data Technologies: Apache Spark, PySpark

   • ETL Tools: Azure Data Factory, custom ETL development

   • Workflow Orchestration: Apache Airflow

   • DevOps/MLOps: Azure ML, MLflow, CI/CD pipelines for ML models

   • Version Control: Git, GitHub/Azure DevOps

   • Other Tools (Preferred): Terraform, Docker, Kubernetes