

New York Technology Partners
Data Engineer (PySpark)
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer (PySpark) with an unspecified contract length and pay rate. It requires expertise in PySpark, Python, SQL, and cloud platforms; familiarity with Databricks and experience in client-facing roles are essential.
Country
United States
Currency
$ USD
Day rate
Unknown
Date
April 29, 2026
Duration
Unknown
Location
Unknown
Contract
Unknown
Security
Unknown
Location detailed
Irvine, CA
Skills detailed
#Databricks #Version Control #SQL (Structured Query Language) #"ETL (Extract, Transform, Load)" #PySpark #Data Pipeline #Datasets #GIT #Scala #Kafka (Apache Kafka) #Data Engineering #Data Warehouse #Data Modeling #Spark (Apache Spark) #Python #Redshift #BitBucket #GitHub #Airflow #AWS (Amazon Web Services) #Cloud
Role description
We are looking for a Data Engineer to design, build, and optimize scalable data pipelines and analytics solutions in a cloud-based environment. The role involves working with large-scale data systems and modern data platforms, and collaborating closely with business stakeholders and engineering teams in a client-facing setting.
Key Responsibilities:
• Design, develop, and maintain scalable data pipelines and ETL processes
• Work with large datasets using distributed processing frameworks like PySpark (see the sketch after this list)
• Build and optimize data solutions on platforms such as Databricks and Kafka
• Implement data models, warehousing solutions, and governance best practices
• Collaborate with cross-functional teams and client stakeholders to deliver solutions
• Support offshore teams and contribute to global delivery execution
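For a sense of what this pipeline work looks like in practice, here is a minimal PySpark ETL sketch. It is an illustration only: the S3 paths, column names, and app name are assumptions, not details from the posting.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical locations; real paths would come from the client's environment.
RAW_PATH = "s3://example-bucket/raw/orders/"
CURATED_PATH = "s3://example-bucket/curated/orders/"

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON events from the landing zone.
raw = spark.read.json(RAW_PATH)

# Transform: drop rows missing key fields, normalize the timestamp,
# derive a partition column, and de-duplicate on the assumed order key.
clean = (
    raw.dropna(subset=["order_id", "order_ts"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# Load: write partitioned Parquet for downstream warehouse/Databricks consumers.
clean.write.mode("overwrite").partitionBy("order_date").parquet(CURATED_PATH)

spark.stop()
```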
Required Skills:
• Strong hands-on experience in PySpark, Python, SQL, and Hive
• Cloud experience (preferably AWS)
• Experience with Databricks and workflow orchestration tools like Airflow (see the sketch after this list)
• Familiarity with MPP data warehouses (e.g., Redshift or similar)
• Solid understanding of data warehousing and data modeling concepts
• Proficiency with Git-based version control (GitHub/Bitbucket)
• Strong communication and client-facing experience
• Experience working in distributed/offshore team environments
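As a rough sketch of the workflow-orchestration side, the Airflow DAG below schedules the ETL job above daily. It assumes Airflow 2.4+; the DAG id, schedule, and spark-submit path are made up for the example.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A hypothetical daily DAG that submits the PySpark job sketched above.
with DAG(
    dag_id="orders_etl_daily",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",  # "schedule" replaces "schedule_interval" in Airflow 2.4+
    catchup=False,
) as dag:
    run_etl = BashOperator(
        task_id="run_orders_etl",
        bash_command="spark-submit /opt/jobs/orders_etl.py",  # assumed job location
    )
```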