Centraprise

Sr. Data Engineer (ETL, PySpark, AWS)

This role is for a Sr. Data Engineer (ETL, PySpark, AWS) on a long-term remote contract, offering competitive pay. Key skills include Python, PySpark, and AWS Glue, along with healthcare data experience. Strong knowledge of ETL/ELT pipeline design and CI/CD is required.
🌎 - Country
United States
πŸ’± - Currency
$ USD
πŸ’° - Day rate
Unknown
πŸ—“οΈ - Date
October 16, 2025
πŸ•’ - Duration
Unknown
🏝️ - Location
Remote
πŸ“„ - Contract
Unknown
πŸ”’ - Security
Unknown
πŸ“ - Location detailed
United States
🧠 - Skills detailed
#Data Lake #Linux #AWS Glue #Scala #Data Modeling #Datasets #DataOps #Apache Spark #Cloud #Data Pipeline #Spark (Apache Spark) #Data Quality #Delta Lake #Data Lakehouse #Python #Batch #IAM (Identity and Access Management) #Kafka (Apache Kafka) #Version Control #Unix #Terraform #DevOps #Data Engineering #ETL (Extract, Transform, Load) #Infrastructure as Code (IaC) #Lambda (AWS Lambda) #Storage #S3 (Amazon Simple Storage Service) #PySpark #Airflow #SQL (Structured Query Language) #AWS (Amazon Web Services) #Apache Airflow #Data Processing #Automation #GitHub #GIT #Normalization
Role description
Sr. Data Engineer (ETL, PySpark, AWS) | Remote | Long term

Job Description:
• We are looking for a Senior Data Engineer to design, build, and optimize large-scale data processing systems supporting healthcare analytics and operational reporting.
• The role involves working closely with DataOps, DevOps, and QA teams to enable scalable and reliable data pipelines.

Key Responsibilities:
• Design and implement ETL/ELT pipelines using Python and PySpark (a minimal sketch follows this description).
• Develop scalable data workflows using Apache Spark and AWS Glue.
• Collaborate with QA and DevOps to integrate CI/CD and testing automation.
• Manage data lake structures and ensure data quality, lineage, and auditability.
• Optimize and monitor the performance of batch and streaming pipelines.
• Build infrastructure as code (IaC) using tools such as Terraform and GitHub Actions.
• Work across structured, semi-structured, and unstructured healthcare datasets.

Required Technical Skills:

Core & Deep Knowledge Assessment:
• Python
• PySpark
• SQL, including window functions and CASE expressions (see the second example below)
• AWS Glue, S3, Lambda
• Apache Spark
• Apache Airflow (see the DAG sketch below)
• Delta Lake / Data Lakehouse architecture
• CI/CD (Terraform, GitHub Actions)
• ETL/ELT pipeline design and optimization

Basic Overall Knowledge Assessment:
• Kafka
• Data modeling and normalization
• Unix/Linux
• Infrastructure as Code (IaC)
• Cloud storage, IAM, and networking fundamentals (AWS)
• Git version control
• Healthcare data domain knowledge
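To make the core pipeline responsibility concrete, here is a minimal PySpark ETL sketch of the read-cleanse-write pattern the listing describes. The bucket names, paths, and the claims schema (claim_id, claim_amount) are hypothetical, since the posting specifies no schema, and Delta Lake support is assumed to be installed on the cluster or Glue job.

```python
# Minimal ETL sketch in PySpark. Bucket names, paths, and columns are
# hypothetical; the posting does not specify a schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("claims-etl-sketch")
    # Assumes the delta-spark package is available on the cluster/Glue job.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Extract: raw healthcare claims landed as JSON in S3 (hypothetical path).
raw = spark.read.json("s3://example-raw-bucket/claims/2025/10/")

# Transform: basic data-quality cleansing plus an audit column.
clean = (
    raw.dropDuplicates(["claim_id"])
       .filter(F.col("claim_amount").isNotNull())
       .withColumn("ingested_at", F.current_timestamp())
)

# Load: append to a Delta table in the curated zone of the data lake.
(clean.write
      .format("delta")
      .mode("append")
      .save("s3://example-curated-bucket/claims_delta/"))
```

The same transform logic would run unchanged as an AWS Glue job, since Glue 3.0+ executes standard PySpark.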
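The deep-knowledge list calls out SQL window functions and CASE expressions specifically. The example below shows both through Spark SQL, staying in the same PySpark context; the tiny in-memory claims sample and its column names are invented for illustration.

```python
# Window function + CASE expression via Spark SQL. The claims data here is
# a small in-memory sample; real tables and columns would differ.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-window-example").getOrCreate()

spark.createDataFrame(
    [("c1", "m1", 12000.0), ("c2", "m1", 800.0), ("c3", "m2", 2500.0)],
    ["claim_id", "member_id", "claim_amount"],
).createOrReplaceTempView("claims")

ranked = spark.sql("""
    SELECT
        claim_id,
        member_id,
        claim_amount,
        -- Window function: rank each member's claims, largest first.
        ROW_NUMBER() OVER (
            PARTITION BY member_id
            ORDER BY claim_amount DESC
        ) AS claim_rank,
        -- CASE expression: bucket claims by size.
        CASE
            WHEN claim_amount >= 10000 THEN 'high'
            WHEN claim_amount >= 1000  THEN 'medium'
            ELSE 'low'
        END AS claim_tier
    FROM claims
""")
ranked.show()
```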
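For the Apache Airflow skill, a minimal DAG sketch of how a daily run of the pipeline above might be scheduled. The DAG id, schedule, and callable are hypothetical, and the `schedule` parameter assumes Airflow 2.4 or later.

```python
# Minimal Airflow DAG sketch; names and schedule are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_claims_etl():
    # Placeholder for triggering the PySpark/Glue job sketched above.
    print("triggering claims ETL")


with DAG(
    dag_id="claims_etl_daily",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",  # Airflow 2.4+ keyword; earlier versions use schedule_interval.
    catchup=False,
) as dag:
    PythonOperator(task_id="run_claims_etl", python_callable=run_claims_etl)
```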