

Cliff Services Inc
W2 Only--Sr Data Engineer with Python--F2F Interview
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is a W2-only Senior Data Engineer position requiring Python, PySpark, and AWS expertise. It offers a hybrid work model (3-4 days onsite) in McLean VA, Richmond VA, or Dallas TX, with a competitive pay rate.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
December 31, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Dallas, TX
-
🧠 - Skills detailed
#AWS (Amazon Web Services) #Pandas #Data Storage #Data Transformations #Kafka (Apache Kafka) #Spark (Apache Spark) #Data Science #Python #Data Ingestion #DynamoDB #Athena #Code Reviews #GIT #Data Architecture #ETL (Extract, Transform, Load) #Data Processing #S3 (Amazon Simple Storage Service) #Data Pipeline #Data Quality #Libraries #Lambda (AWS Lambda) #Security #Version Control #Data Engineering #Apache Airflow #Documentation #Cloud #Snowflake #Apache Spark #Redshift #Airflow #Data Governance #Data Modeling #Scala #SQL (Structured Query Language) #Storage #PySpark #Schema Design #Programming #BI (Business Intelligence)
Role description
Job Title: Data Engineers
Type: Hybrid (3 to 4 days per week in office)
Interview: In Person
Locations: McLean VA, Richmond VA, or Dallas TX
Job Description:
A Data Engineer with Python, PySpark, and AWS expertise is responsible for designing, building, and maintaining scalable and efficient data pipelines in cloud environments.
Responsibilities:
Design, develop, and maintain robust ETL/ELT pipelines using Python and PySpark for data ingestion, transformation, and processing (a minimal PySpark sketch follows this list).
Work extensively with AWS cloud services such as S3, Glue, EMR, Lambda, Redshift, Athena, and DynamoDB for data storage, processing, and warehousing.
Build and optimize data ingestion and processing frameworks for large-scale data sets, ensuring data quality, consistency, and accuracy.
Collaborate with data architects, data scientists, and business intelligence teams to understand data requirements and deliver effective data solutions.
Implement data governance, lineage, and security best practices within data pipelines and infrastructure.
Automate data workflows and improve data pipeline performance through optimization and tuning.
Develop and maintain documentation for data solutions, including data dictionaries, lineage, and technical specifications.
Participate in code reviews, contribute to continuous improvement initiatives, and troubleshoot complex data and pipeline issues.
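As a rough illustration of the pipeline work described above, the sketch below ingests raw CSV from S3 with PySpark, applies basic cleansing, and writes partitioned Parquet to a curated S3 zone. Bucket names, paths, and column names are hypothetical placeholders, not details from this posting.
```python
# Minimal PySpark ETL sketch: ingest raw CSV from S3, apply a simple
# transformation, and write partitioned Parquet back to S3.
# All S3 paths and column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-etl-sketch")
    .getOrCreate()
)

# Ingest: read raw CSV landed in a hypothetical S3 "raw" zone.
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-raw-bucket/orders/")
)

# Transform: basic cleansing and enrichment typical of a curation step.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("total_amount", F.col("quantity") * F.col("unit_price"))
       .filter(F.col("total_amount") > 0)
)

# Load: write partitioned Parquet to a curated zone that Athena or
# Redshift Spectrum could query.
(
    clean.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-bucket/orders/")
)

spark.stop()
```
In practice a job like this would typically run on EMR or Glue and be parameterized per dataset rather than hard-coded.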
Required Skills:
Strong programming proficiency in Python, including libraries like Pandas and extensive experience with PySpark for distributed data processing.
Solid understanding and practical experience with Apache Spark/PySpark for large-scale data transformations.
Demonstrated experience with AWS data services, including S3, Glue, EMR, Lambda, Redshift, and Athena.
Proficiency in SQL and a strong understanding of data modeling, schema design, and data warehousing concepts.
Experience with workflow orchestration tools such as Apache Airflow or AWS Step Functions (see the Airflow sketch after this list).
Familiarity with CI/CD pipelines and version control systems (e.g., Git).
Excellent problem-solving, analytical, and communication skills, with the ability to work effectively in a team environment.
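For the orchestration requirement above, here is a minimal Apache Airflow sketch (assuming Airflow 2.x) that wires extract, transform, and load steps into a daily DAG. The DAG id, schedule, and task bodies are illustrative placeholders only.
```python
# Minimal Airflow 2.x sketch of a daily ETL DAG. The DAG id and the task
# callables are hypothetical; in a real pipeline the transform step would
# typically trigger a PySpark job on EMR or Glue.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw files from a source system into S3.
    print("extracting raw data")


def transform():
    # Placeholder: run a PySpark transformation over the raw data.
    print("transforming data")


def load():
    # Placeholder: load curated data into Redshift or register it for Athena.
    print("loading curated data")


with DAG(
    dag_id="daily_orders_etl_sketch",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",   # Airflow 2.4+ style; older versions use schedule_interval
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```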
Preferred Skills:
Experience with streaming frameworks like Kafka or Kinesis (see the streaming sketch below).
Knowledge of other data warehousing solutions such as Snowflake.
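As a sketch of how the preferred streaming experience might look with PySpark, the snippet below reads a Kafka topic via Spark Structured Streaming and appends Parquet to S3. Broker, topic, and bucket names are hypothetical, and it assumes the spark-sql-kafka connector package is available on the cluster.
```python
# Minimal Spark Structured Streaming sketch: consume a Kafka topic and
# write micro-batches as Parquet to S3 with checkpointing.
# Broker addresses, topic name, and S3 paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-stream-sketch").getOrCreate()

# Subscribe to a hypothetical Kafka topic.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "orders")
    .load()
)

# Kafka keys/values arrive as bytes; cast to strings for downstream parsing.
parsed = events.select(
    F.col("key").cast("string").alias("order_id"),
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp"),
)

# Append Parquet files to S3; the checkpoint location tracks stream progress.
query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "s3://example-stream-bucket/orders/")
    .option("checkpointLocation", "s3://example-stream-bucket/_checkpoints/orders/")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```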
Thanks & regards,
K Hemanth | Recruitment Specialist
Email: hemanth.k@cliff-services.com





