Programmers.io

Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer in Austin, TX, with a contract length of "unknown" and a pay rate of "unknown." Key skills include Java, AWS, Hadoop, Kafka, Spark/PySpark, and Airflow. Industry experience in data engineering is required.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 18, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Austin, TX
-
🧠 - Skills detailed
#Monitoring #Observability #SQL (Structured Query Language) #Batch #AWS S3 (Amazon Simple Storage Service) #ETL (Extract, Transform, Load) #Kafka (Apache Kafka) #Airflow #Presto #S3 (Amazon Simple Storage Service) #AWS (Amazon Web Services) #Prometheus #Athena #Grafana #Data Quality #Data Engineering #GIT #HDFS (Hadoop Distributed File System) #Hadoop #Data Pipeline #Trino #PySpark #Compliance #Apache Spark #Spark (Apache Spark) #Data Processing #Apache Airflow #Apache Kafka #Java #Libraries #Python
Role description
One of our leading clients is looking for a Data Engineer in Austin, TX.

Key Responsibilities:
• Design, build, and maintain data pipelines across on-prem Hadoop and AWS
• Develop and maintain Java applications, utilities, and data processing libraries
• Manage and enhance internal Java libraries used for ingestion, validation, and transformation
• Migrate and sync data from on-prem HDFS to AWS S3
• Develop and maintain Airflow DAGs for orchestration and scheduling
• Work with Kafka-based streaming pipelines for real-time/near-real-time ingestion
• Build and optimize Spark/PySpark jobs for large-scale data processing
• Use Hive, Presto/Trino, and Athena for querying and validation
• Implement data quality checks, monitoring, and alerting
• Support Iceberg tables and AWS external tables
• Troubleshoot production issues and ensure SLA compliance
• Collaborate with platform, analytics, and observability teams

Technical Skills Required:
• Java (development, maintenance, build tools such as Gradle)
• AWS (S3, Glue, EMR, Athena, EKS basics)
• Hadoop/HDFS, Hive
• Apache Kafka (producers/consumers, topics, streaming ingestion)
• Apache Spark / PySpark (batch + streaming processing)
• Apache Airflow (DAG development and maintenance)
• Python
• Git and CI/CD workflows
• Observability tools (Prometheus/Grafana)
• SQL
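To give a concrete sense of the day-to-day work described above, here is a minimal sketch of an HDFS-to-S3 sync job in PySpark. All paths, bucket names, and file names are hypothetical placeholders, not details from this posting.

```python
# hdfs_to_s3_sync.py -- illustrative batch job; source/target paths are supplied by the scheduler.
import sys

from pyspark.sql import SparkSession


def main(source_path: str, target_path: str) -> None:
    spark = SparkSession.builder.appName("hdfs-to-s3-sync").getOrCreate()

    # Read a Parquet dataset from on-prem HDFS, e.g. hdfs:///data/events/dt=2026-02-18
    df = spark.read.parquet(source_path)

    # Minimal data-quality gate before landing data in the lake.
    if df.count() == 0:
        raise ValueError(f"No rows found at {source_path}")

    # Requires the hadoop-aws / S3A connector on the Spark classpath,
    # e.g. writing to s3a://example-bucket/events/dt=2026-02-18
    df.write.mode("overwrite").parquet(target_path)
    spark.stop()


if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```

And a hedged example of the kind of Airflow DAG that could schedule that job daily; the connection ID, file locations, and schedule are assumptions for illustration only.

```python
# dags/hdfs_to_s3_sync_dag.py -- hypothetical DAG using the Spark provider package.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="hdfs_to_s3_sync",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",   # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    SparkSubmitOperator(
        task_id="sync_events",
        application="/opt/jobs/hdfs_to_s3_sync.py",
        conn_id="spark_default",
        application_args=["hdfs:///data/events", "s3a://example-bucket/events"],
    )
```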