Lead Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Lead Data Engineer with a contract length of "unknown" and a pay rate of "$X per hour." Required skills include Databricks, Apache Kafka, AWS services, and Python. A Bachelor's degree and 7+ years of data engineering experience are essential.
🌎 - Country
United States
πŸ’± - Currency
$ USD
πŸ’° - Day rate
Unknown
πŸ—“οΈ - Date discovered
August 21, 2025
πŸ•’ - Project duration
Unknown
🏝️ - Location type
Unknown
πŸ“„ - Contract type
Unknown
πŸ”’ - Security clearance
Unknown
πŸ“ - Location detailed
Dallas, TX
🧠 - Skills detailed
#IAM (Identity and Access Management) #Code Reviews #AutoScaling #Storage #Data Engineering #Apache Kafka #ETL (Extract, Transform, Load) #GCP (Google Cloud Platform) #Leadership #Data Quality #Batch #Data Ingestion #Lambda (AWS Lambda) #Data Science #SQL (Structured Query Language) #AWS (Amazon Web Services) #Python #Data Processing #PySpark #VPC (Virtual Private Cloud) #Data Lake #Databricks #Cloud #S3 (Amazon Simple Storage Service) #Deployment #Security #Spark (Apache Spark) #Kafka (Apache Kafka) #Azure #Compliance #Data Catalog #EC2 #Scala #Data Architecture #Spark SQL #Data Pipeline #Data Governance #Computer Science #Delta Lake
Role description
Job Summary:
As a Databricks Lead, you will be a critical member of our data engineering team, responsible for designing, developing, and optimizing our data pipelines and platforms on Databricks, primarily leveraging AWS services. You will play a key role in implementing robust data governance with Unity Catalog and ensuring cost-effective data solutions. This role requires a strong technical leader who can mentor junior engineers, drive best practices, and contribute hands-on to complex data challenges.

Responsibilities:

Databricks Platform Leadership:
• Lead the design, development, and deployment of large-scale data solutions on the Databricks platform.
• Establish and enforce best practices for Databricks usage, including notebook development, job orchestration, and cluster management.
• Stay abreast of the latest Databricks features and capabilities, recommending and implementing improvements.

Data Ingestion and Streaming (Kafka):
• Architect and implement real-time and batch data ingestion pipelines using Apache Kafka for high-volume data streams.
• Integrate Kafka with Databricks for seamless data processing and analysis (see the sketch after this list).
• Optimize Kafka consumers and producers for performance and reliability.

Data Governance and Management (Unity Catalog):
• Implement and manage data governance policies and access controls using Databricks Unity Catalog (a related sketch appears at the end of this description).
• Define and enforce data cataloging, lineage, and security standards within the Databricks Lakehouse.
• Collaborate with data governance teams to ensure compliance and data quality.

AWS Cloud Integration:
• Leverage various AWS services (S3, EC2, Lambda, Glue, etc.) to build a robust and scalable data infrastructure.
• Manage and optimize AWS resources for Databricks workloads.
• Ensure secure and compliant integration between Databricks and AWS.

Cost Optimization:
• Proactively identify and implement strategies for cost optimization across Databricks and AWS resources.
• Monitor DBU consumption, cluster utilization, and storage costs, providing recommendations for efficiency gains.
• Implement autoscaling, auto-termination, and right-sizing strategies to minimize operational expenses.

Technical Leadership & Mentoring:
• Provide technical guidance and mentorship to a team of data engineers.
• Conduct code reviews, promote coding standards, and foster a culture of continuous improvement.
• Lead technical discussions and decision-making for complex data engineering problems.

Data Pipeline Development & Optimization:
• Develop, test, and maintain robust and efficient ETL/ELT pipelines using PySpark/Spark SQL.
• Optimize Spark jobs for performance, scalability, and resource utilization.
• Troubleshoot and resolve complex data pipeline issues.

Collaboration:
• Work closely with data scientists, analysts, and other engineering teams to understand data requirements and deliver solutions.
• Communicate technical concepts effectively to both technical and non-technical stakeholders.
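To make the Kafka-to-Databricks integration above concrete, here is a minimal PySpark Structured Streaming sketch of the kind of ingestion pipeline this role owns. The broker address, topic name, and S3 paths are hypothetical placeholders, not details from this posting.

```python
# Minimal sketch: streaming Kafka ingestion into a Delta table on Databricks.
# The Kafka source ships with the Databricks runtime; broker, topic, and
# storage paths below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1.example.com:9092")  # hypothetical broker
    .option("subscribe", "orders-events")                            # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast to strings before downstream parsing.
events = raw.select(
    col("key").cast("string"),
    col("value").cast("string"),
    col("timestamp"),
)

# Land the stream in Delta; the checkpoint makes the pipeline restartable.
query = (
    events.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/orders-events")  # hypothetical
    .start("s3://example-bucket/delta/orders_events")                               # hypothetical
)
```

In practice this would run as a scheduled Databricks job writing to Unity Catalog-managed tables rather than raw S3 paths; the sketch only shows the core read/cast/write shape.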
β€’ Solid understanding of AWS cloud services and their application in data architectures (S3, EC2, Lambda, VPC, IAM, etc.). β€’ Demonstrated ability to optimize cloud resource usage and implement cost-saving strategies. β€’ Proficiency in Python and Spark (PySpark/Spark SQL) for data processing and analysis. β€’ Experience with Delta Lake and other modern data lake formats. β€’ Excellent problem-solving, analytical, and communication skills. Added Advantage (Bonus Skills): β€’ Experience with Apache Flink for stream processing. β€’ Databricks certifications. β€’ Experience with CI/CD pipelines for Databricks deployments. β€’ Knowledge of other cloud platforms (Azure, GCP) is a plus.