Novia Infotech

Hadoop QA

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Hadoop Data Lake Automation Engineer in Charlotte, NC / Dallas, TX (Hybrid) on a contract basis. It requires 4–5 years of experience with the Hadoop ecosystem, automating data workflows, and proficiency in Python or Scala. Pay rate unspecified.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 13, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Dallas, TX
-
🧠 - Skills detailed
#Apache NiFi #Data Governance #Cloud #Metadata #HDFS (Hadoop Distributed File System) #Scripting #Data Engineering #Scala #Data Quality #Sqoop (Apache Sqoop) #Apache Airflow #Documentation #Data Lake #Hadoop #Data Ingestion #NiFi (Apache NiFi) #Airflow #Data Management #GCP (Google Cloud Platform) #Big Data #AWS (Amazon Web Services) #Azure #DevOps #ETL (Extract, Transform, Load) #Python #Data Pipeline #Security #Spark (Apache Spark) #Data Catalog #Automation
Role description
Role: Hadoop Data Lake Automation Engineer
Location: Charlotte, NC / Dallas, TX (Hybrid)
Contract Role

Job Summary:
We are looking for a skilled and motivated Hadoop Data Lake Automation Engineer with 4–5 years of experience in automating data workflows and processes within Hadoop-based data lake environments. The ideal candidate will be responsible for building scalable automation solutions, optimizing data pipelines, and ensuring efficient data movement and transformation across platforms.

Key Responsibilities:
• Design and implement automation solutions for data ingestion, transformation, and processing in Hadoop data lake environments.
• Develop and maintain scalable data pipelines using tools such as Apache NiFi, Spark, Hive, and Sqoop (an illustrative sketch follows this description).
• Collaborate with data engineers, analysts, and business stakeholders to understand data requirements and deliver automation solutions.
• Monitor and troubleshoot data workflows, ensuring reliability and performance.
• Implement best practices for data governance, security, and metadata management.
• Maintain documentation for data flows, automation scripts, and operational procedures.
• Support production environments and participate in on-call rotations as needed.

Required Skills & Qualifications:
• 3–5 years of hands-on experience with the Hadoop ecosystem (HDFS, Hive, Spark, Sqoop, Oozie, etc.).
• Strong experience in automating data lake workflows and ETL processes.
• Proficiency in scripting languages such as Python, Shell, or Scala.
• Experience with scheduling and orchestration tools (e.g., Apache Airflow, Control-M, AutoSys).
• Solid understanding of data modelling, data quality, and performance optimization.
• Familiarity with cloud platforms (AWS, Azure, GCP) and big data services.
• Excellent problem-solving and communication skills.

Preferred Qualifications:
• Experience with Apache NiFi or similar data flow tools.
• Exposure to CI/CD pipelines and DevOps practices.
• Knowledge of data cataloguing and lineage tools.
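
As a rough illustration of the kind of pipeline automation this role describes, below is a minimal sketch of an Airflow DAG that chains a Sqoop import into a Spark transformation. It assumes Airflow 2.4+ with the Apache Spark provider installed; the DAG name, JDBC URL, table, file paths, and connection ID are hypothetical placeholders, not details from the posting.

```python
# Minimal sketch, assuming Airflow 2.4+ and apache-airflow-providers-apache-spark.
# All names, paths, and connection details below are placeholders for illustration.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="data_lake_daily_ingest",      # hypothetical DAG name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Pull the day's records from a relational source into HDFS with Sqoop (CLI call).
    sqoop_import = BashOperator(
        task_id="sqoop_import",
        bash_command=(
            "sqoop import --connect jdbc:mysql://db-host/sales "   # placeholder JDBC URL
            "--table orders --target-dir /datalake/raw/orders/{{ ds }} -m 4"
        ),
    )

    # Transform the raw files and write a curated, Hive-compatible dataset with Spark.
    spark_transform = SparkSubmitOperator(
        task_id="spark_transform",
        application="/opt/jobs/curate_orders.py",   # placeholder PySpark job
        conn_id="spark_default",
        application_args=["--run-date", "{{ ds }}"],
    )

    sqoop_import >> spark_transform
```

The same orchestration pattern would extend to NiFi flows or Oozie workflows; the posting lists those tools as options rather than prescribing a specific stack.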