

Senior ETL Pipeline Engineer
Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior ETL Pipeline Engineer in Dallas, TX, on a 6-12 month contract with a pay rate of "X". It requires expertise in Python, Apache Spark, Docker, Kubernetes, and cloud-agnostic ETL pipeline development.
Country: United States
Currency: $ USD
Day rate: -
Date discovered: May 23, 2025
Project duration: More than 6 months
Location type: On-site
Contract type: 1099 Contractor
Security clearance: Unknown
Location detailed: Dallas, TX
Skills detailed: #ETL (Extract, Transform, Load) #REST API #Airflow #Python #Trino #Cloud #Data Lake #Docker #Storage #Hadoop #Infrastructure as Code (IaC) #Kubernetes #Automation #RDBMS (Relational Database Management System) #Presto #Terraform #NoSQL #S3 (Amazon Simple Storage Service) #GCP (Google Cloud Platform) #Azure #AWS (Amazon Web Services) #Databases #Spark (Apache Spark) #REST (Representational State Transfer) #Data Engineering #Data Pipeline #Deployment #Athena #Apache Airflow #Scala #Apache Spark #Data Storage #Datasets #Data Architecture
Role description
Job Title: Senior ETL Pipeline Engineer
Location: Dallas, TX
Duration: 6-12 Month Contract
Overview
Seeking an experienced Senior ETL Pipeline Engineer with strong expertise in building scalable, cloud-agnostic data pipelines using modern data engineering tools and platforms. This role involves end-to-end ownership of ETL development, from design through deployment, in a containerized and orchestrated environment. The ideal candidate is comfortable working across multi-cloud and hybrid infrastructures, integrating diverse data sources, and supporting long-term data initiatives.
Key Responsibilities
• Design, develop, and manage robust ETL pipelines using Python and Apache Spark to process large-scale datasets across structured, semi-structured, and unstructured formats (a minimal PySpark sketch follows this list).
• Containerize ETL workflows using Docker for portability and deploy them using Kubernetes for scalability and fault tolerance.
• Leverage Apache Airflow for orchestrating and scheduling complex data workflows.
• Build and maintain cloud-agnostic pipelines capable of running in multi-cloud or hybrid (cloud + on-premises) environments.
• Integrate data from a variety of sources, including the Hadoop ecosystem, RDBMS, NoSQL databases, REST APIs, and third-party data providers.
• Work with data lake architectures and technologies such as Amazon S3, Trino, Presto, and Athena to support analytics and reporting use cases.
• Implement CI/CD practices to automate deployment and updates for ETL pipelines.
• Collaborate with cross-functional teams to align pipeline design with business and data architecture goals.
• Monitor pipeline health, performance, and cost-efficiency; troubleshoot and resolve issues proactively.
• Document pipeline architecture, operational playbooks, and best practices.
• (Preferred) Contribute to infrastructure automation using Infrastructure as Code (IaC) tools.
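For illustration only, a minimal sketch of the kind of pipeline described above, assuming a hypothetical events dataset on S3 (the bucket, paths, and column names are placeholders, not part of this posting):

    # Minimal PySpark ETL sketch: read raw JSON, apply simple transforms,
    # and write partitioned Parquet to a data lake path. All names below
    # are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Extract: semi-structured input (JSON lines)
    raw = spark.read.json("s3a://example-bucket/raw/events/")

    # Transform: deduplicate, derive a partition column, drop bad rows
    cleaned = (
        raw.dropDuplicates(["event_id"])
           .withColumn("event_date", F.to_date("event_ts"))
           .filter(F.col("event_type").isNotNull())
    )

    # Load: columnar output partitioned for downstream query engines
    # (e.g., Trino, Presto, or Athena reading the same S3 layout)
    (cleaned.write.mode("overwrite")
            .partitionBy("event_date")
            .parquet("s3a://example-bucket/curated/events/"))

    spark.stop()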
Requirements
Strong proficiency in Python and hands-on experience with Apache Spark for data transformation.
Deep understanding of Docker and Kubernetes for containerized deployments.
Experience with AWS Cloud and willingness to work across other cloud platforms as needed.
Solid experience with Apache Airflow or equivalent orchestration tools (a minimal DAG sketch follows this list).
Demonstrated experience in developing cloud-agnostic ETL pipelines and operating in hybrid environments.
Familiarity with a variety of data storage and query tools including RDBMS, NoSQL, Hadoop-based systems, and cloud-native services.
Strong problem-solving skills, ownership mindset, and ability to execute on long-term, complex projects.
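As a rough illustration of the orchestration experience above, a minimal Airflow DAG that runs a containerized Spark job once a day; the schedule, command, and task split are hypothetical, and the sketch assumes the Airflow 2.4+ API:

    # Minimal Airflow DAG sketch: schedule a Spark ETL job daily.
    # Commands, paths, and IDs are hypothetical placeholders.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {
        "owner": "data-eng",
        "retries": 2,
        "retry_delay": timedelta(minutes=10),
    }

    with DAG(
        dag_id="etl_events_daily",
        default_args=default_args,
        start_date=datetime(2025, 1, 1),
        schedule="@daily",  # Airflow 2.4+ argument name
        catchup=False,
    ) as dag:
        # On Kubernetes this step would often be a KubernetesPodOperator
        # running the Docker image that packages the Spark job; a
        # BashOperator keeps the sketch self-contained.
        run_etl = BashOperator(
            task_id="run_spark_etl",
            bash_command=(
                "spark-submit --master k8s://https://kubernetes.default.svc "
                "--deploy-mode cluster local:///opt/jobs/etl_events.py"
            ),
        )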
Preferred Qualifications
Experience working with multi-cloud environments (AWS, Azure, GCP).
Exposure to Infrastructure as Code (e.g., Terraform, CloudFormation) for provisioning and managing environments (a Python-based IaC sketch follows this list).
Cloud certifications in data engineering, Kubernetes, or container orchestration.
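Terraform and CloudFormation are the tools named above; as a Python-flavored sketch of the same IaC idea, here is a minimal Pulumi program (Pulumi is a substitute example, not mentioned in this posting, and the resource names are placeholders):

    # Minimal Pulumi (Python IaC) sketch: provision a versioned S3 bucket
    # for curated pipeline output. Names are hypothetical placeholders.
    import pulumi
    from pulumi_aws import s3

    curated = s3.Bucket(
        "curated-datasets",
        versioning=s3.BucketVersioningArgs(enabled=True),
    )

    # Expose the bucket id for other stacks or pipelines to consume
    pulumi.export("curated_bucket", curated.id)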