

VARSA SOLUTIONS
Data Engineer (Modern Stack — Python/PySpark/AWS)
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer (Modern Stack — Python/PySpark/AWS); the contract length and pay rate are not listed. Key requirements include 5+ years of Data Engineering experience and expertise in Python, PySpark, and AWS.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
March 16, 2026
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Deployment #Jenkins #Data Analysis #PySpark #SQL (Structured Query Language) #Data Processing #Shell Scripting #Snowflake #Pandas #Terraform #Scala #Data Quality #Docker #GIT #Data Engineering #S3 (Amazon Simple Storage Service) #Distributed Computing #Databases #Data Science #Datasets #Data Extraction #Python #DevOps #Spark (Apache Spark) #Lambda (AWS Lambda) #NoSQL #SQL Queries #Scripting #AWS (Amazon Web Services) #Kubernetes #Cloud #Libraries #ETL (Extract, Transform, Load) #Data Pipeline #Data Transformations #HBase #Microservices #DynamoDB #Apache Spark #Hadoop #NumPy #Artifactory
Role description
Job Description
We are seeking a highly motivated and experienced Data Engineer to join our team, focusing on building, optimizing, and deploying robust, scalable data solutions. The ideal candidate will possess deep expertise in Python and PySpark to drive complex data transformations and support high-volume, performance-critical data initiatives.
Key Responsibilities
• Design, build, and maintain high-performance ETL/ELT data pipelines using Python and PySpark (a minimal sketch of this kind of pipeline follows this list)
• Apply expertise in Python data analysis libraries including Pandas and NumPy
• Develop and manage data processing jobs leveraging PySpark for distributed computing across large-scale datasets
• Implement DevOps practices and tooling for automated deployment and orchestration of Python applications
• Utilize AWS or other cloud services to architect and maintain cloud-based data ecosystems
• Write and optimize complex SQL queries for data extraction, integrity checks, and performance tuning
• Collaborate with data scientists and analysts to ensure data quality, availability, and consistency
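As a rough illustration of the pipeline work described above, here is a minimal PySpark ETL sketch. It is not drawn from the employer's codebase; the bucket paths, column names, and app name are all hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

# A minimal sketch; paths and column names below are invented for illustration.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw JSON landed in S3 (hypothetical bucket).
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Transform: normalize types, drop malformed rows, derive a partition column.
clean = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet for downstream consumers.
(clean.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-curated-bucket/orders/"))
```

The same read-transform-write shape carries over to Glue or EMR jobs; only the session configuration and the deployment wrapper change.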
Required Technical Skills
• 5+ years of experience in Data Engineering
• Expert-level proficiency in Python — Pandas, NumPy
• Solid hands-on experience with PySpark for building scalable data workflows
• AWS (S3, Glue, EMR, Lambda) and Snowflake strongly preferred
• Docker, Kubernetes, Terraform, CloudFormation
• Advanced SQL and data warehousing knowledge (an example query pattern follows this list)
• Apache Spark, Hadoop ecosystem
• CI/CD tools — Jenkins, Git, Artifactory/Nexus
• Strong knowledge of shell scripting and file systems
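As one concrete example of the SQL work the role calls for, the hypothetical query below deduplicates a raw feed to the latest record per key using a window function, run through Spark SQL so it stays in the same Python stack. The view and column names are invented for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dedup-example").getOrCreate()

# Register a hypothetical raw feed as a temp view.
raw = spark.read.parquet("s3://example-raw-bucket/orders/")
raw.createOrReplaceTempView("orders_raw")

# Keep only the most recently ingested record per order_id.
latest = spark.sql("""
    SELECT *
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY order_id
                   ORDER BY ingested_at DESC
               ) AS rn
        FROM orders_raw
    ) t
    WHERE rn = 1
""")
latest.createOrReplaceTempView("orders_latest")
```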
Nice to Have
• Healthcare or financial services domain experience
• NoSQL databases — HBase, DynamoDB (a short access sketch follows this list)
• Microservices and service-oriented architecture
• OpenShift, container orchestration platforms
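For the NoSQL nice-to-have, here is a minimal DynamoDB access sketch using boto3; the table name, key schema, and attribute values are hypothetical.

```python
import boto3

# Hypothetical table with a composite key (patient_id, event_ts).
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("patient_events")

# Upsert one item.
table.put_item(Item={
    "patient_id": "p-123",
    "event_ts": "2026-03-16T00:00:00Z",
    "status": "ingested",
})

# Point read by full primary key.
resp = table.get_item(Key={"patient_id": "p-123", "event_ts": "2026-03-16T00:00:00Z"})
item = resp.get("Item")
```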