

Techgene Solutions
Data Engineer (Spark + SQL + Redshift/Snowflake + ETL/Data Lake)
This role is for a Data Engineer with 8+ years of experience in Spark, SQL, and cloud data warehousing. It is a hybrid position in Westwood, MA / Johnston, RI, offering a competitive pay rate. Key skills include ETL development and Data Lake technologies.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
June 24, 2026
🕒 - Duration
Unknown
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Westwood, MA
🧠 - Skills detailed
#Data Pipeline #Data Processing #Automation #Airflow #Agile #Kafka (Apache Kafka) #Data Engineering #S3 (Amazon Simple Storage Service) #SQL Queries #Datasets #Scala #Big Data #Data Lake #Redshift #Data Architecture #Batch #Data Integration #Deployment #Monitoring #Apache Spark #Amazon Redshift #ETL (Extract, Transform, Load) #Data Quality #AWS EMR (Amazon Elastic MapReduce) #SQL (Structured Query Language) #AWS S3 (Amazon Simple Storage Service) #Spark (Apache Spark) #Cloud #Data Modeling #PySpark #Snowflake #AWS (Amazon Web Services)
Role description
Job Title: Data Engineer (Spark + SQL + Redshift/Snowflake + ETL/Data Lake)
Location: Westwood, MA / Johnston, RI (Hybrid / Onsite)
Job Description:
We are seeking an experienced Data Engineer with strong expertise in building scalable data platforms and modern data pipelines. The ideal candidate will have hands-on experience in Spark (PySpark/Scala), SQL, cloud data warehousing, ETL/ELT frameworks, and Data Lake architectures supporting both batch and real-time data processing.
Key Responsibilities:
• Design, develop, and maintain scalable data pipelines for ingesting, transforming, and processing large-scale datasets.
• Build and optimize batch and streaming data solutions using Apache Spark (PySpark/Scala).
• Develop robust ETL/ELT workflows and integrate data from multiple internal and external sources.
• Work with cloud-based Data Lake architectures using AWS S3, Parquet, Iceberg, and related technologies.
• Design and maintain data models and solutions using Amazon Redshift and/or Snowflake.
• Implement and support CDC (Change Data Capture) pipelines and event-driven data integrations using Kafka.
• Optimize SQL queries and improve performance, scalability, and data reliability.
• Build and maintain data quality, validation, monitoring, and governance frameworks.
• Collaborate with cross-functional teams including Data Architects, Analytics, and Business stakeholders.
• Support deployment, troubleshooting, and production monitoring of data workflows.
Required Skills:
• 8+ years of experience in Data Engineering / Big Data development.
• Strong hands-on experience with Apache Spark (PySpark and/or Scala).
• Advanced proficiency in SQL development and performance tuning.
• Experience with Amazon Redshift and/or Snowflake.
• Expertise in building ETL/ELT pipelines and orchestration workflows.
• Experience with Data Lake technologies (AWS S3, Parquet, Iceberg).
• Experience implementing Kafka-based streaming solutions and CDC pipelines.
• Experience with AWS EMR and cloud-based data platforms.
• Strong understanding of data modeling, optimization, and distributed processing concepts.
Preferred Qualifications:
• Experience with CI/CD and data pipeline automation.
• Exposure to modern orchestration tools (Airflow or equivalent).
• Experience working in Agile environments.
• Strong communication and stakeholder management skills.