Mastech Digital

Lead Data Engineer-Apache Iceberg

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Lead Data Engineer specializing in Apache Iceberg. The contract length and pay rate are not specified. Key skills include ETL workflows, Apache Spark, and data migration from Cloudera Hadoop.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
October 16, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Strongsville, OH
-
🧠 - Skills detailed
#Monitoring #Migration #Apache Iceberg #Strategy #Scala #Presto #Data Management #Datasets #Cloudera #Metadata #Apache Spark #Cloud #Data Pipeline #Spark (Apache Spark) #Data Quality #Data Ingestion #Data Engineering #HDFS (Hadoop Distributed File System) #ETL (Extract, Transform, Load) #Storage #Impala #Trino #Hadoop #Logging
Role description
Skills/experience:
Lead the migration of datasets and ETL workflows from Cloudera Hadoop (Hive, Impala, HDFS, etc.) to an Apache Iceberg-based architecture.
Analyze existing data pipelines and storage formats (e.g., Parquet, ORC) to plan and execute a smooth migration strategy.
Design and implement scalable data ingestion and transformation pipelines using Apache Spark, Flink, or equivalent tools.
Optimize data partitioning, schema evolution, compaction, and metadata management using Iceberg best practices.
Integrate Iceberg tables with query engines such as Trino or Presto to support data analytics use cases.
Ensure compatibility and data quality during the migration phase through robust testing, validation, and lineage tracking.
Establish monitoring, logging, and performance tuning for migrated pipelines and Iceberg tables.