

Novia Infotech
Hadoop QA
⭐ - Featured Role | Apply direct with Data Freelance Hub
This is a contract position for a Hadoop Data Lake Automation Engineer, hybrid in Charlotte, NC or Dallas, TX. Requires 4–5 years of experience with Hadoop ecosystems, automation of data workflows, and proficiency in Python or Scala. Pay rate unspecified.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 13, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Dallas, TX
-
🧠 - Skills detailed
#Apache NiFi #Data Governance #Cloud #Metadata #HDFS (Hadoop Distributed File System) #Scripting #Data Engineering #Scala #Data Quality #Sqoop (Apache Sqoop) #Apache Airflow #Documentation #Data Lake #Hadoop #Data Ingestion #NiFi (Apache NiFi) #Airflow #Data Management #GCP (Google Cloud Platform) #Big Data #AWS (Amazon Web Services) #Azure #DevOps #"ETL (Extract #Transform #Load)" #Python #Data Pipeline #Security #Spark (Apache Spark) #Data Catalog #Automation
Role description
Role : Hadoop Data Lake Automation Engineer
Location : Charlotte, NC / Dallas, TX (Hybrid)
Contract Role
Job Summary:
We are looking for a skilled and motivated Hadoop Data Lake Automation Engineer with 4–5 years of experience in automating data workflows and processes within Hadoop-based data lake environments. The ideal candidate will be responsible for building scalable automation solutions, optimizing data pipelines, and ensuring efficient data movement and transformation across platforms.
Key Responsibilities:
• Design and implement automation solutions for data ingestion, transformation, and processing in Hadoop data lake environments.
• Develop and maintain scalable data pipelines using tools such as Apache NiFi, Spark, Hive, and Sqoop (an illustrative sketch follows this list).
• Collaborate with data engineers, analysts, and business stakeholders to understand data requirements and deliver automation solutions.
• Monitor and troubleshoot data workflows, ensuring reliability and performance.
• Implement best practices for data governance, security, and metadata management.
• Maintain documentation for data flows, automation scripts, and operational procedures.
• Support production environments and participate in on-call rotations as needed.
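As a rough illustration of the pipeline work described above, here is a minimal PySpark sketch of an ingestion/transformation job; the paths, table names, and columns are hypothetical assumptions, not taken from the posting.
# Minimal PySpark sketch of a data-lake ingestion/transformation job (illustrative only).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
spark = (
    SparkSession.builder
    .appName("datalake-ingest-orders")   # hypothetical job name
    .enableHiveSupport()                 # allow writing results as a Hive table
    .getOrCreate()
)
# Read raw landing-zone files from HDFS (path is an assumption)
raw = spark.read.option("header", "true").csv("hdfs:///landing/orders/")
# Basic cleansing/transformation: type casting, null filtering, partition column
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("order_id").isNotNull())
)
# Write to a partitioned Hive table in the curated zone (table name hypothetical)
(clean.write
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("curated.orders"))
spark.stop()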
Required Skills & Qualifications:
• 3–5 years of hands-on experience in Hadoop ecosystem (HDFS, Hive, Spark, Sqoop, Oozie, etc.).
• Strong experience in automating data lake workflows and ETL processes.
• Proficiency in scripting languages such as Python, Shell, or Scala.
• Experience with scheduling and orchestration tools (e.g., Apache Airflow, Control-M, AutoSys); a brief Airflow sketch follows this list.
• Solid understanding of data modelling, data quality, and performance optimization.
• Familiarity with cloud platforms (AWS, Azure, GCP) and big data services.
• Excellent problem-solving and communication skills.
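To illustrate the orchestration experience mentioned above, the following is a minimal Apache Airflow DAG sketch; the DAG id, schedule, and shell commands are assumptions for illustration only, not part of this role's actual environment.
# Minimal Airflow DAG sketch chaining a Sqoop import and a Spark job (illustrative only).
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator
default_args = {
    "owner": "data-engineering",      # hypothetical owner
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}
with DAG(
    dag_id="datalake_orders_ingest",  # hypothetical DAG id
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    # Pull the latest extract from the source RDBMS into HDFS (command is illustrative)
    sqoop_import = BashOperator(
        task_id="sqoop_import_orders",
        bash_command="sqoop import --connect jdbc:mysql://src-db/orders --table orders --target-dir /landing/orders",
    )
    # Run the Spark transformation job sketched earlier (script path is hypothetical)
    spark_transform = BashOperator(
        task_id="spark_transform_orders",
        bash_command="spark-submit /opt/jobs/transform_orders.py",
    )
    sqoop_import >> spark_transform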
Preferred Qualifications:
• Experience with Apache NiFi or similar data flow tools.
• Exposure to CI/CD pipelines and DevOps practices.
• Knowledge of data cataloguing and lineage tools.






