

Jobs via Dice
Data Engineer - Manhattan West, NY / Naveen
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer (Intermediate) on a contract-to-hire basis in Manhattan West, NY. Key skills include ETL pipeline development, AWS, Snowflake, PySpark, and Apache Airflow. Strong experience with data lake architectures and large-scale data processing is required.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
May 28, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
New York, NY
-
🧠 - Skills detailed
#Data Science #Data Warehouse #Data Processing #Automation #Java #Scala #Data Engineering #Data Pipeline #Programming #PySpark #EC2 #Data Ingestion #Apache Airflow #AWS (Amazon Web Services) #Snowflake #S3 (Amazon Simple Storage Service) #Apache Spark #Documentation #Data Modeling #ETL (Extract, Transform, Load) #Security #Spark (Apache Spark) #Data Quality #Datasets #Airflow #Data Lake
Role description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Ampcus Inc, is seeking the following. Apply via Dice today!
Title: Data Engineer Intermediate
Position Type: Contract to hire
Location: Manhattan West, NY (Onsite)
Job Summary
We are seeking a skilled Data Engineer to design, build, and manage scalable ETL pipelines supporting a centralized data lake and Snowflake data warehouse. The role focuses on automating data ingestion, transformation, and aggregation workflows to enable reliable analytics and data-driven decision-making.
Key Responsibilities
• Design, develop, and maintain robust ETL pipelines for ingesting data into the enterprise data lake and Snowflake environment.
• Automate data processing, aggregation, and analytical workflows to improve data availability and performance.
• Implement and manage orchestration and scheduling of data pipelines using Control-M and Apache Airflow.
• Develop scalable data transformation logic using PySpark and Apache Spark (Java).
• Work with large structured and semi-structured datasets on AWS infrastructure.
• Ensure data quality, integrity, and reliability across data pipelines.
• Optimize data pipelines for performance, cost, and scalability.
• Collaborate with analytics, data science, and business teams to understand data requirements.
• Monitor, troubleshoot, and resolve pipeline failures and performance bottlenecks.
• Follow best practices for data engineering, security, and documentation.
Required Skills & Qualifications
• Strong experience with data lake architectures and large-scale data processing.
• Hands-on experience with AWS services (e.g., S3, EC2, EMR, Glue, or related).
• Proven expertise in building ETL pipelines for analytics and reporting use cases.
• Solid working knowledge of Snowflake, including data loading, transformations, and performance optimization.
• Experience with workflow automation and scheduling tools such as Control-M and Apache Airflow.
• Proficiency in PySpark for distributed data processing.
• Strong programming experience with Apache Spark using Java.
• Good understanding of data modeling, partitioning, and performance tuning concepts.





