New York Technology Partners

Data Engineer (PySpark)

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer (PySpark) on a contract basis, requiring expertise in PySpark, SQL, and AWS. Key skills include building data pipelines, optimizing workflows, and collaborating with global teams. Experience with Airflow and data warehousing is essential.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 14, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Irvine, CA
-
🧠 - Skills detailed
#Data Quality #GIT #AWS (Amazon Web Services) #Datasets #Kafka (Apache Kafka) #Version Control #Data Engineering #PySpark #Cloud #Spark (Apache Spark) #Data Modeling #Data Warehouse #SQL (Structured Query Language) #Data Lake #Distributed Computing #Data Processing #Python #Data Pipeline #Redshift #Databricks #Data Governance #Scala #Airflow
Role description
We are looking for a talented Data Engineer to design and deliver scalable data solutions that power analytics and business insights. This role requires deep technical expertise in modern data engineering practices, along with the ability to collaborate effectively with both internal teams and external stakeholders. The ideal candidate is experienced in building robust data pipelines, working with large-scale distributed systems, and optimizing performance across cloud-based data platforms.

Key Responsibilities
• Design, build, and maintain scalable data pipelines and data processing frameworks
• Process and manage large datasets using distributed computing technologies
• Partner with cross-functional teams and stakeholders to understand data needs and deliver effective solutions
• Optimize and tune data workflows across platforms such as Databricks and Kafka
• Implement best practices in data modeling, data warehousing, and data governance
• Develop and manage workflow orchestration using tools like Airflow
• Contribute to global delivery efforts, including coordination with offshore teams

Required Qualifications
• Strong hands-on experience with PySpark, Hive, SQL, and Python
• Experience working with cloud platforms (AWS preferred)
• Proficiency with workflow orchestration tools (e.g., Airflow)
• Experience using version control systems such as Git-based platforms
• Familiarity with MPP data warehouses (e.g., Redshift or similar technologies)
• Solid understanding of data warehousing principles and data modeling techniques
• Exposure to Databricks and modern data lake architectures
• Strong communication skills with the ability to work in client-facing environments
• Experience collaborating with distributed and offshore teams

What We're Looking For
• A problem-solver who can build efficient, reliable, and scalable data systems
• Strong attention to performance optimization and data quality
• Ability to translate business requirements into technical data solutions
• Collaborative mindset with experience working across global teams
• Ownership mentality with a focus on delivering high-quality outcomes
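To give a concrete flavour of the pipeline work described above, here is a minimal PySpark sketch of a batch job that reads raw data from a data lake, applies a SQL transformation, and writes a partitioned output. The bucket, dataset, and column names are illustrative assumptions, not details of the actual engagement.

```python
# Minimal batch-pipeline sketch (hypothetical paths and column names).
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("daily_orders_pipeline")  # hypothetical job name
    .getOrCreate()
)

# Read raw events from a hypothetical S3 data lake path.
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Basic data-quality filter and a derived date column.
clean = (
    orders
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
)

# SQL step: aggregate revenue per day.
clean.createOrReplaceTempView("orders_clean")
daily = spark.sql("""
    SELECT order_date, SUM(amount) AS revenue
    FROM orders_clean
    GROUP BY order_date
""")

# Write partitioned output to the curated zone of the data lake.
(
    daily.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/daily_revenue/")
)

spark.stop()
```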
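Workflow orchestration with Airflow, mentioned in both the responsibilities and the qualifications, typically follows the pattern sketched below: a daily DAG that submits the Spark job and then runs a validation step. The DAG id, schedule, and file paths are hypothetical, and the `schedule` argument assumes Airflow 2.4+ (older versions use `schedule_interval`).

```python
# Hypothetical orchestration sketch; names and paths are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",   # run once per day
    catchup=False,
) as dag:
    # Submit the PySpark job; spark-submit path and script location are assumptions.
    run_spark_job = BashOperator(
        task_id="run_spark_job",
        bash_command="spark-submit /opt/jobs/daily_orders_pipeline.py",
    )

    # Downstream data-quality check (placeholder command).
    validate_output = BashOperator(
        task_id="validate_output",
        bash_command="python /opt/jobs/validate_daily_revenue.py",
    )

    run_spark_job >> validate_output
```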