

STAFFXPERT LLC
Lead Data Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Lead Data Engineer on a long-term remote contract, offering a competitive pay rate. Requirements include 8–10+ years of data engineering experience, proficiency in Python and SQL, and experience with cloud data services on AWS or Azure.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
June 17, 2026
🕒 - Duration
Unknown
🏝️ - Location
Remote
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Data Ingestion #Monitoring #Scala #SQL (Structured Query Language) #ChatGPT #AI (Artificial Intelligence) #Azure #DevOps #Azure Data Factory #Python #Logging #Documentation #ADLS (Azure Data Lake Storage) #Data Layers #Data Quality #S3 (Amazon Simple Storage Service) #Snowflake #GIT #Redshift #AWS (Amazon Web Services) #AWS Glue #Airflow #Data Engineering #Version Control #Data Science #Debugging #dbt (data build tool) #Indexing #ETL (Extract, Transform, Load) #Data Processing #Azure ADLS (Azure Data Lake Storage) #Spark (Apache Spark) #Cloud #Data Pipeline #Automation #ADF (Azure Data Factory) #Batch #AWS S3 (Amazon Simple Storage Service) #Synapse
Role description
Role: Lead Data Engineer
Location: Remote
Duration: Long Term
Support experience: Mandatory
Key responsibilities
• Serve as L3 support: triage high-severity incidents, perform advanced debugging and root-cause analysis, deploy hotfixes, and create runbooks for L2 teams.
• Build and maintain batch and streaming data pipelines using ETL/ELT tools such as dbt to integrate and transform multi-source data.
• Implement data quality validation, monitoring, alerting, and documentation; optimize pipelines for performance, cost, and reliability (partitioning, indexing, error handling).
• Partner with analytics, data science, and business teams to deliver data requirements, troubleshoot issues, and meet SLAs for freshness and completeness.
Required qualifications
• 8–10+ years of data engineering experience building and supporting production pipelines at scale.
• Design, build, and maintain data ingestion, transformation, and delivery pipelines across structured and semi-structured data sources.
• Develop modular, reusable data transformation logic using Python, SQL, and frameworks such as dbt.
• Implement data models and schemas optimized for analytics and reporting (star, snowflake, or dimensional).
• Apply Medallion Architecture principles to organize data layers for quality, traceability, and performance.
• Use cloud-native data services such as AWS Glue, S3, Redshift, and EMR, or Azure Data Factory, ADLS, and Synapse, to manage data workflows.
• Set up and manage data pipeline orchestration, scheduling, and monitoring using Airflow, ADF, or equivalent tools.
• Apply data quality checks, validation logic, and logging mechanisms to ensure consistency and trust in data assets.
• Collaborate with analysts, scientists, and architects to design data models that align with business and analytical needs.
• Maintain code versioning, testing, and CI/CD standards for data pipeline development.
• Proven cloud data platform and orchestration experience (Snowflake/BigQuery + Airflow/dbt).
• L3 support experience: incident management, on-call rotations, and debugging distributed data systems.
Core Competencies & Skills
• Strong understanding of data engineering fundamentals: ETL/ELT design, data modelling, schema evolution, and data integrity.
• Proficient in Python and SQL for data transformation, automation, and workflow scripting.
• Hands-on experience with cloud-based data services in AWS (S3, Glue, Redshift, EMR) or Azure (ADLS, ADF, Synapse).
• Working knowledge of distributed data processing concepts (Spark, Hive, or equivalent).
• Familiarity with dbt for transformation design, testing, and data documentation.
• Awareness of Medallion Architecture and data layering concepts for scalable data management.
• Understanding of orchestration frameworks like Airflow or Data Factory for scheduling and monitoring pipelines.
• Knowledge of Git-based version control, CI/CD, and basic DevOps practices in data workflows.
• An AI skill set: some experience with Claude, ChatGPT, and similar tools, or at least the ability to pick up those skills.






