ClifyX

Data Engineer (W2 Only)

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer with 8+ years of experience, offering a 12+ month contract in Chicago, IL (Hybrid). Key skills include PySpark, ETL/ELT, cloud platforms (AWS, Azure, GCP), and data modeling.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
October 17, 2025
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Chicago, IL
-
🧠 - Skills detailed
#Databases #Databricks #PySpark #Big Data #Data Ingestion #Monitoring #Data Engineering #Data Modeling #Azure #Documentation #AWS EMR (Amazon Elastic MapReduce) #Datasets #ADLS (Azure Data Lake Storage) #Scala #Data Quality #Data Science #Dataflow #Data Architecture #Data Vault #Vault #Azure Databricks #Data Pipeline #Data Accuracy #Snowflake #Quality Assurance #S3 (Amazon Simple Storage Service) #GCP (Google Cloud Platform) #Cloud #AWS (Amazon Web Services) #Data Transformations #ETL (Extract, Transform, Load) #Spark (Apache Spark) #Data Processing #ML (Machine Learning)
Role description
Hello, greetings from Clifyx.

Visa: GC/USC

Title: Data Engineer
Location: Chicago, IL (Hybrid - Onsite)
Duration: 12+ Months Contract
Minimum experience: 8+ years

Job Description:
• Data Pipeline Development: Design, develop, test, and deploy robust, scalable data pipelines using PySpark for data ingestion, transformation, and loading (ETL/ELT) from various sources (e.g., S3, ADLS, databases, APIs, streaming data).
• Big Data Processing: Use PySpark to process large datasets efficiently, handling complex data transformations, aggregations, and data quality checks.
• Performance Optimization: Optimize PySpark jobs for performance, efficiency, and cost-effectiveness, identifying and resolving bottlenecks.
• Data Modeling: Collaborate with data architects and analysts to design and implement efficient data models (e.g., star schema, snowflake schema, data vault) for analytical and reporting purposes.
• Cloud Integration: Work with cloud platforms (AWS, Azure, GCP) and their big data services (e.g., AWS EMR, Azure Databricks, GCP Dataflow/Dataproc) to deploy and manage PySpark applications; a strong understanding of the medallion architecture is expected.
• Collaboration: Work closely with data scientists, machine learning engineers, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
• Testing and Quality Assurance: Implement comprehensive unit, integration, and end-to-end tests for data pipelines to ensure data accuracy and reliability.
• Monitoring and Support: Monitor production data pipelines, troubleshoot issues, and provide ongoing support to ensure data availability and integrity.
• Documentation: Create and maintain clear, concise documentation for data pipelines, data models, and processes.
• Innovation: Stay up to date with the latest advancements in big data technologies, PySpark, and cloud services, and recommend new tools and approaches.

Thanks & Best Regards,
Vishal Swami – Clifyx (US IT Recruiter)
Contact Number: 908-279-1295
LinkedIn: linkedin.com/in/vishal-swami-790a72179
Headquarters: South Plainfield, NJ – 07080