

Net2Source Inc.
Senior Data Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Engineer in Reading, PA, on a contract basis. Pay rate is unspecified. Key skills include PySpark, Python Microservices, AWS, and Apache Iceberg. Requires a degree in Computer Science or related field and experience with data pipelines.
Country: United States
Currency: $ USD
Day rate: 560
Date: April 7, 2026
Duration: Unknown
Location: On-site
Contract: Unknown
Security: Unknown
Location detailed: Reading, PA
Skills detailed: #Kafka (Apache Kafka) #Programming #PostgreSQL #Data Modeling #Spark SQL #MySQL #Data Engineering #Aurora #Computer Science #Data Lake #Redshift #Lambda (AWS Lambda) #Data Processing #Batch #Data Ingestion #Logging #Monitoring #Spark (Apache Spark) #Storage #Scala #Code Reviews #Microservices #Data Pipeline #Apache Iceberg #Data Architecture #Athena #S3 (Amazon Simple Storage Service) #ETL (Extract, Transform, Load) #Data Integration #AWS (Amazon Web Services) #Cloud #Python #Data Transformations #PySpark #SQL (Structured Query Language)
Role description
Role: Senior Data Engineer
Location: Reading, PA
Term: Contract
We are seeking a highly skilled Senior Data Engineer to design, build, and operate scalable, high-performance data platforms. This is a hands-on engineering role requiring deep expertise in PySpark, Python Microservices, and Python programming, along with modern data lake technologies such as Apache Iceberg. The ideal candidate will work closely with data architects and platform leads to implement reliable batch and streaming data pipelines on AWS that support analytics and business-critical applications.
Key Responsibilities
• Hands-On Data Engineering
• Design, develop, and maintain large-scale batch and streaming data pipelines using PySpark and Kafka.
• Write production-grade Python code for complex data transformations, validations, and business logic.
• Implement efficient processing of high-volume, high-velocity data across distributed systems.
• Streaming & Real-Time Processing
• Build and operate real-time and near real-time data pipelines using Kafka (a minimal streaming sketch follows this list).
• Implement stateful processing, windowing, checkpointing, and fault-tolerant streaming applications.
• Ensure low-latency and high-throughput streaming solutions.
• Data Lake & Iceberg
• Design and manage data lake architectures using Apache Iceberg on cloud storage (S3).
• Implement Iceberg capabilities such as schema evolution, partitioning, compaction, and time travel (see the Iceberg sketch after the Required Qualifications).
• Optimize read and write performance for large Iceberg tables.
• Cloud & Data Integration
• Design, develop, and deploy data pipelines on AWS using services such as S3, EMR, Glue, Lambda, Athena, Redshift, and Aurora (MySQL/PostgreSQL) for data ingestion, processing, and analytics.
• Performance, Reliability & Operations
• Tune Spark jobs for performance, scalability, and cost efficiency (a tuning sketch appears at the end of this posting).
• Troubleshoot and resolve complex production issues in distributed data systems.
• Implement monitoring, alerting, logging, and recovery strategies for data pipelines.
• Engineering Excellence & Collaboration
• Write clean, testable, and maintainable code following engineering best practices.
• Contribute to CI/CD pipelines for data engineering workloads.
• Participate in code reviews, technical design discussions, and architecture reviews.
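The streaming responsibilities above map naturally onto Spark Structured Streaming reading from Kafka. The following is a minimal sketch of that pattern, not the employer's actual pipeline: the broker address, topic name ("orders"), JSON schema, and S3 checkpoint path are all placeholders invented for illustration, and the job assumes the spark-sql-kafka connector is on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Assumed shape of the JSON payload on the topic.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the raw Kafka stream (placeholder broker and topic).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "orders")
       .load())

# Parse, watermark for late data, then aggregate per 5-minute window --
# the windowing and stateful processing the role describes.
windowed = (raw
            .select(F.from_json(F.col("value").cast("string"), schema).alias("o"))
            .select("o.*")
            .withWatermark("event_time", "10 minutes")
            .groupBy(F.window("event_time", "5 minutes"))
            .agg(F.count("*").alias("orders"),
                 F.sum("amount").alias("revenue")))

# The checkpoint location is what makes the query fault tolerant: on
# restart, Spark resumes from the last committed offsets and state.
query = (windowed.writeStream
         .outputMode("update")
         .format("console")  # stand-in sink; production would write to Iceberg/S3
         .option("checkpointLocation", "s3://example-bucket/checkpoints/orders")
         .start())
query.awaitTermination()
```

The watermark bounds how long Spark retains window state for late events, which is the usual lever for keeping a long-running stateful job's memory in check.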
Required Qualifications
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• Strong expertise in PySpark, Python microservices, and Spark SQL for large-scale data processing.
• Strong expertise in Kafka, including streaming fundamentals and stateful processing.
• Hands-on experience building and running Kafka-based streaming applications in production environments.
• Advanced proficiency in Python for building scalable, production-grade data solutions.
• Hands-on experience with Apache Iceberg in production environments.
• Solid experience with AWS data services (S3, EMR, Glue, Lambda, Redshift).
• Advanced SQL skills and strong understanding of data modeling and data lake architectures.
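On the Iceberg side, the capabilities the posting names (schema evolution, compaction, time travel) are all exposed through Spark SQL. Below is an illustrative sketch, assuming Spark 3.3+ with the Iceberg runtime on the classpath; the catalog name (demo), warehouse path, table names, and snapshot id are placeholders, not details from the role.

```python
from pyspark.sql import SparkSession

# Session wired up for Iceberg with a Hadoop-type catalog on S3
# (placeholder warehouse path).
spark = (SparkSession.builder
         .appName("iceberg-sketch")
         .config("spark.sql.extensions",
                 "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
         .config("spark.sql.catalog.demo",
                 "org.apache.iceberg.spark.SparkCatalog")
         .config("spark.sql.catalog.demo.type", "hadoop")
         .config("spark.sql.catalog.demo.warehouse", "s3://example-bucket/warehouse")
         .getOrCreate())

# Partitioned Iceberg table using a hidden partition transform.
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo.analytics.orders (
        order_id   STRING,
        amount     DOUBLE,
        event_time TIMESTAMP)
    USING iceberg
    PARTITIONED BY (days(event_time))
""")

# Schema evolution: adding a column is a metadata-only change.
spark.sql("ALTER TABLE demo.analytics.orders ADD COLUMN channel STRING")

# Time travel: query the table as of an earlier snapshot (placeholder id).
spark.sql(
    "SELECT * FROM demo.analytics.orders VERSION AS OF 1234567890123456789"
).show()

# Compaction: rewrite small data files with Iceberg's Spark procedure.
spark.sql("CALL demo.system.rewrite_data_files(table => 'analytics.orders')")
```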
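Finally, for the tuning work called out under Performance, Reliability & Operations, much of the day-to-day cost and performance effort starts from a handful of session-level settings. The sketch below assumes Spark 3.x; the specific values are illustrative starting points, not recommendations for any particular workload.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("tuned-batch-job")
         # Adaptive Query Execution re-plans joins and partition counts at runtime.
         .config("spark.sql.adaptive.enabled", "true")
         # Coalesce small shuffle partitions to cut per-task overhead.
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         # Baseline shuffle parallelism; AQE adjusts downward from here.
         .config("spark.sql.shuffle.partitions", "400")
         # Broadcast dimension tables up to 64 MB instead of shuffling them.
         .config("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))
         .getOrCreate())
```

From there, tuning is usually iterative: read the Spark UI, find the skewed or spilling stage, and adjust partitioning or join strategy accordingly.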






