

Novia Infotech
Hadoop QA
⭐ - Featured Role | Apply direct with Data Freelance Hub
This is a contract position for a Hadoop Data Lake Automation Engineer, hybrid in Charlotte, NC or Dallas, TX. Requires 4–5 years of experience with Hadoop ecosystems, automation of data workflows, and proficiency in Python or Scala. Pay rate unspecified.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 13, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Dallas, TX
-
🧠 - Skills detailed
#Apache NiFi #Data Governance #Cloud #Metadata #HDFS (Hadoop Distributed File System) #Scripting #Data Engineering #Scala #Data Quality #Sqoop (Apache Sqoop) #Apache Airflow #Documentation #Data Lake #Hadoop #Data Ingestion #NiFi (Apache NiFi) #Airflow #Data Management #GCP (Google Cloud Platform) #Big Data #AWS (Amazon Web Services) #Azure #DevOps #"ETL (Extract #Transform #Load)" #Python #Data Pipeline #Security #Spark (Apache Spark) #Data Catalog #Automation
Role description
Role : Hadoop Data Lake Automation Engineer
Location : Charlotte, NC / Dallas, TX (Hybrid)
Contract Role
Job Summary:
We are looking for a skilled and motivated Hadoop Data Lake Automation Engineer with 4–5 years of experience in automating data workflows and processes within Hadoop-based data lake environments. The ideal candidate will be responsible for building scalable automation solutions, optimizing data pipelines, and ensuring efficient data movement and transformation across platforms.
Key Responsibilities:
• Design and implement automation solutions for data ingestion, transformation, and processing in Hadoop data lake environments.
• Develop and maintain scalable data pipelines using tools such as Apache NiFi, Spark, Hive, and Sqoop (an illustrative sketch follows this list).
• Collaborate with data engineers, analysts, and business stakeholders to understand data requirements and deliver automation solutions.
• Monitor and troubleshoot data workflows, ensuring reliability and performance.
• Implement best practices for data governance, security, and metadata management.
• Maintain documentation for data flows, automation scripts, and operational procedures.
• Support production environments and participate in on-call rotations as needed.
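As a rough illustration of the pipeline work described above, here is a minimal PySpark sketch of an ingestion/transformation job; the paths, table names, and columns are hypothetical assumptions, not taken from the posting.
# Minimal PySpark sketch of a data-lake ingestion/transformation job (illustrative only).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
spark = (
    SparkSession.builder
    .appName("datalake-ingest-orders")   # hypothetical job name
    .enableHiveSupport()                 # allow writing results as a Hive table
    .getOrCreate()
)
# Read raw landing-zone files from HDFS (path is an assumption)
raw = spark.read.option("header", "true").csv("hdfs:///landing/orders/")
# Basic cleansing/transformation: type casting, null filtering, partition column
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("order_id").isNotNull())
)
# Write to a partitioned Hive table in the curated zone (table name hypothetical)
(clean.write
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("curated.orders"))
spark.stop()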
Required Skills & Qualifications:
• 3–5 years of hands-on experience in Hadoop ecosystem (HDFS, Hive, Spark, Sqoop, Oozie, etc.).
• Strong experience in automating data lake workflows and ETL processes.
• Proficiency in scripting languages such as Python, Shell, or Scala.
• Experience with scheduling and orchestration tools (e.g., Apache Airflow, Control-M, AutoSys); a brief Airflow sketch follows this list.
• Solid understanding of data modelling, data quality, and performance optimization.
• Familiarity with cloud platforms (AWS, Azure, GCP) and big data services.
• Excellent problem-solving and communication skills.
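To illustrate the orchestration experience mentioned above, the following is a minimal Apache Airflow DAG sketch; the DAG id, schedule, and shell commands are assumptions for illustration only, not part of this role's actual environment.
# Minimal Airflow DAG sketch chaining a Sqoop import and a Spark job (illustrative only).
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator
default_args = {
    "owner": "data-engineering",      # hypothetical owner
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}
with DAG(
    dag_id="datalake_orders_ingest",  # hypothetical DAG id
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    # Pull the latest extract from the source RDBMS into HDFS (command is illustrative)
    sqoop_import = BashOperator(
        task_id="sqoop_import_orders",
        bash_command="sqoop import --connect jdbc:mysql://src-db/orders --table orders --target-dir /landing/orders",
    )
    # Run the Spark transformation job sketched earlier (script path is hypothetical)
    spark_transform = BashOperator(
        task_id="spark_transform_orders",
        bash_command="spark-submit /opt/jobs/transform_orders.py",
    )
    sqoop_import >> spark_transform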
Preferred Qualifications:
• Experience with Apache NiFi or similar data flow tools.
• Exposure to CI/CD pipelines and DevOps practices.
• Knowledge of data cataloguing and lineage tools.






