IPolarity

Data Engineer (Spark | Hadoop | Apache Ozone)

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer (Spark | Hadoop | Apache Ozone) with a contract length of "X months", offering a pay rate of "$X per hour". Key skills include Apache Spark, Hadoop, Apache Ozone, and programming in Python, Scala, or Java. A Bachelor's degree in a related field is required.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
March 3, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Berkeley Heights, NJ
-
🧠 - Skills detailed
#"ETL (Extract #Transform #Load)" #Azure #Data Science #Data Processing #AWS (Amazon Web Services) #Apache Ozone #Data Security #Big Data #Storage #Scripting #Python #Compliance #Data Engineering #Scala #Linux #HBase #Data Pipeline #Batch #Java #Security #Data Quality #Cloud #GCP (Google Cloud Platform) #Programming #Computer Science #Spark (Apache Spark) #Automation #Kubernetes #HDFS (Hadoop Distributed File System) #Apache Spark #Hadoop #YARN (Yet Another Resource Negotiator) #Docker #Kafka (Apache Kafka) #Unix #Data Storage #Shell Scripting #SQL (Structured Query Language)
Role description
Responsibilities:
• Design and implement scalable distributed data processing solutions using Apache Spark and the Hadoop ecosystem.
• Build and maintain Spark applications for ETL, aggregation, and large-scale data transformation.
• Implement and manage enterprise data storage using Apache Ozone and HDFS.
• Develop batch and real-time ingestion pipelines using modern big data technologies.
• Optimize cluster performance, storage efficiency, and resource utilization.
• Ensure data quality, governance, security, and compliance across platforms.
• Troubleshoot performance issues across distributed environments.
• Collaborate with Data Scientists, Analysts, and Application teams to deliver reliable data solutions.
• Automate workflows and operational processes using scripting and orchestration tools.

Required Skills:
✔ Strong experience with Apache Spark (Core, SQL, Streaming).
✔ Hands-on expertise with the Hadoop ecosystem (HDFS, YARN, MapReduce).
✔ Experience working with Apache Ozone object storage.
✔ Programming skills in Python, Scala, or Java.
✔ Experience building scalable ETL/data pipelines.
✔ Knowledge of distributed systems and cluster optimization.
✔ Strong Linux/Unix and shell scripting experience.
✔ Understanding of data security, governance, and compliance practices.

Preferred Skills:
• Hive, HBase, or Kafka experience.
• Cloud-based big data platforms (AWS, Azure, or GCP).
• Containerization exposure (Docker, Kubernetes).
• CI/CD and automation for data engineering workflows.

Qualifications:
• Bachelor's degree in Computer Science, Software Engineering, or a related field.
• Experience delivering enterprise data platform or product implementations preferred.
• Excellent communication and collaboration skills, with a strong problem-solving mindset and analytical thinking.