Cyberobotix

Data Engineer

โญ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer in Berkeley Heights, NJ, with a contract length of "unknown." Pay rate is "unknown." Key skills include AWS, Python, PySpark, SQL, and experience with data modeling and big data technologies.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
๐Ÿ—“๏ธ - Date
April 23, 2026
🕒 - Duration
Unknown
-
๐Ÿ๏ธ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
๐Ÿ“ - Location detailed
New Jersey, United States
-
🧠 - Skills detailed
#Scala #SQL (Structured Query Language) #Spark (Apache Spark) #PySpark #SageMaker #Security #Spark SQL #Data Quality #Visualization #Lambda (AWS Lambda) #Big Data #AWS (Amazon Web Services) #Microsoft Power BI #Data Engineering #ML (Machine Learning) #S3 (Amazon Simple Storage Service) #Terraform #BI (Business Intelligence) #TensorFlow #Data Pipeline #Data Lake #AI (Artificial Intelligence) #Infrastructure as Code (IaC) #ETL (Extract, Transform, Load) #Snowflake #Redshift #Data Science #Apache Spark #Python #Data Warehouse #Athena #Data Modeling #AWS S3 (Amazon Simple Storage Service) #Cloud #Hadoop #Batch
Role description
Job Title: Lead Data Engineer / Sr. Data Engineer
Location: Berkeley Heights, NJ

Key Skills Required
· AWS (S3, Redshift, Glue, Lambda, EMR, Athena)
· Data Engineering & Data Modeling (Star Schema, Snowflake, Dimensional Modeling)
· Python, PySpark, SQL
· Big Data Technologies (Hadoop, Spark)
· Infrastructure as Code (Terraform)
· AI/ML integration basics
· Visualization tools (Power BI)

Roles & Responsibilities
· Design, develop, and maintain scalable data pipelines for batch and real-time processing using AWS services
· Build and optimize data lakes and data warehouses using Amazon S3, Redshift, and Glue
· Develop robust ETL/ELT pipelines using Python, PySpark, and SQL
· Implement efficient data modeling techniques such as star schema and dimensional modeling
· Work with large-scale distributed systems using Hadoop and Apache Spark
· Integrate AI/ML models into data pipelines to support advanced analytics
· Automate infrastructure provisioning using Terraform (IaC)
· Ensure data quality, governance, and security across pipelines
· Collaborate with cross-functional teams including data scientists, analysts, and business stakeholders
· Develop dashboards and reports using Power BI for business insights
· Monitor and optimize performance of data pipelines and cloud resources
· Exposure to AI/ML frameworks (SageMaker, TensorFlow, etc.)
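As a rough illustration of the star-schema modeling this role calls for (all table and column names below are hypothetical examples, not taken from the employer), a minimal dimensional model with one fact table and two dimension tables can be sketched in plain SQL, here using Python's built-in sqlite3 so the sketch is self-contained:

```python
import sqlite3

# Minimal star-schema sketch: one additive fact table joined to two
# dimension tables via surrogate keys. All names are illustrative.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,  -- surrogate key, e.g. 20260423
    full_date TEXT,
    year      INTEGER
);
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT,
    region       TEXT
);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    amount       REAL               -- additive measure
);
""")

cur.execute("INSERT INTO dim_date VALUES (20260423, '2026-04-23', 2026)")
cur.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                [(1, "Acme", "NJ"), (2, "Globex", "NY")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(20260423, 1, 100.0),
                 (20260423, 2, 50.0),
                 (20260423, 1, 25.0)])

# Typical star-schema query: join the fact to a dimension, then aggregate.
cur.execute("""
SELECT c.region, SUM(f.amount)
FROM fact_sales f
JOIN dim_customer c ON f.customer_key = c.customer_key
GROUP BY c.region
ORDER BY c.region
""")
totals = dict(cur.fetchall())
print(totals)  # {'NJ': 125.0, 'NY': 50.0}
```

In production the same shape would typically live in Redshift or Snowflake with loads driven by PySpark or Glue jobs, but the join-fact-to-dimension-and-aggregate pattern is identical.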