Staffing Technologies

Lead Data Engineer (Databricks | Python | PySpark)

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Lead Data Engineer (Databricks | Python | PySpark); the contract length and pay rate are not specified. Candidates must be located in the EST time zone and bring 15+ years of experience, with strong skills in Databricks, Python, PySpark, SQL, and DevOps practices.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
January 31, 2026
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Cloud #Code Reviews #Data Ingestion #Data Quality #Data Engineering #GIT #Scala #Infrastructure as Code (IaC) #Databricks #Python #Terraform #DevOps #Model Optimization #Spark (Apache Spark) #Deployment #Scrum #Metadata #Monitoring #Leadership #PySpark #Data Processing #Data Lineage #Data Warehouse #ETL (Extract, Transform, Load) #SQL (Structured Query Language) #Agile
Role description
Must be located in the EST time zone.

Overview
We are seeking a Lead Data Engineer to design, build, and lead modern cloud-based data platforms with a strong focus on Databricks, Python, and PySpark. This role combines hands-on engineering with technical leadership, owning architecture decisions, delivery standards, and scalable data solutions.

Key Responsibilities
• Lead the design and delivery of cloud-native data platforms using Databricks
• Architect and implement Lakehouse and Data Warehouse patterns
• Build and optimize ETL/ELT pipelines using Python and PySpark (see the first sketch after this description)
• Establish engineering standards, reusable frameworks, and metadata-driven orchestration (see the second sketch after this description)
• Review designs, vet solutions with the team, and lead demos and retros prior to deployment
• Enforce data quality, lineage, monitoring, and alerting across pipelines
• Mentor engineers and provide hands-on technical leadership
• Partner with analytics and business teams to align solutions with data and reporting needs

Required Experience & Skills

Core Experience
• ~15 years of total experience in data or software engineering
• 3+ years in a technical lead role
• 5+ years building cloud-based data platforms
• Proven delivery of production-grade, scalable data systems
• Excellent communication skills are critical for this role

Databricks (Strong Focus)
• Hands-on experience with Databricks Notebooks, Jobs, and workload optimization
• Building pipelines using Lakeflow / Declarative Pipelines
• Data ingestion via Databricks connectors
• Implementing data lineage, quality checks, monitoring, and alerting
• Table, compute, and performance optimization within Databricks

Python, PySpark & Spark
• Advanced Python with strong packaging and dependency management
• Expert PySpark for distributed data processing
• Clear understanding of Spark versus single-node execution
• Spark performance tuning and troubleshooting

SQL
• Strong SQL for mid-to-complex transformations
• Query and data model optimization to reduce compute and improve performance

Engineering & DevOps Practices
• Strong adherence to SOLID and DRY principles
• Experience building parameterized, reusable frameworks
• Agile/Scrum delivery experience
• Git-based development workflows and code reviews
• Testing strategies: unit, integration, and end-to-end (see the final sketch after this description)
• CI/CD pipelines and Infrastructure as Code (Terraform)
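To make the pipeline expectations concrete, here is a minimal sketch of a PySpark ETL job with a simple data quality gate, in the spirit of the responsibilities above. The source path, table name, and column names are all hypothetical, and a production job would add lineage and alerting around this core.

```python
# A minimal ETL sketch; paths, tables, and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw orders landed by an ingestion connector.
raw = spark.read.json("/mnt/raw/orders/")

# Transform: normalize types, derive a partition column, drop rows missing keys.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .filter(F.col("order_id").isNotNull())
)

# Quality gate: fail the job rather than load rows with null keys downstream.
null_keys = orders.filter(F.col("customer_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"{null_keys} rows with null customer_id; aborting load")

# Load: append to a Delta table, partitioned for downstream query pruning.
(orders.write.format("delta")
       .mode("append")
       .partitionBy("order_date")
       .saveAsTable("analytics.orders"))
```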
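The "metadata-driven orchestration" and "parameterized, reusable frameworks" requirements point at the same idea: one generic loader driven by configuration. The sketch below assumes the metadata lives inline for brevity; in practice it would sit in a control table or YAML file, and all source paths and table names are made up.

```python
# Metadata-driven ingestion sketch; all names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("metadata_driven_ingest").getOrCreate()

# Each entry describes one source-to-target load; adding a feed is a
# config change, not new code (the DRY / reusable-framework principle).
INGEST_CONFIG = [
    {"source": "/mnt/raw/customers/", "format": "json",
     "target": "bronze.customers", "mode": "append"},
    {"source": "/mnt/raw/products/", "format": "parquet",
     "target": "bronze.products", "mode": "overwrite"},
]

def ingest(entry: dict) -> None:
    """Load one configured source into its bronze Delta table."""
    df = spark.read.format(entry["format"]).load(entry["source"])
    (df.write.format("delta")
       .mode(entry["mode"])
       .saveAsTable(entry["target"]))

for entry in INGEST_CONFIG:
    ingest(entry)
```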
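Finally, the unit-testing expectation usually means keeping transformations in plain functions so they can be exercised on small in-memory DataFrames. This sketch uses pytest with a local Spark session; the function and columns are hypothetical.

```python
# Unit-test sketch for a Spark transformation; names are hypothetical.
import pytest
from pyspark.sql import SparkSession, DataFrame, functions as F

def add_order_total(df: DataFrame) -> DataFrame:
    """Business logic under test: line total = quantity * unit_price."""
    return df.withColumn("total", F.col("quantity") * F.col("unit_price"))

@pytest.fixture(scope="session")
def spark():
    # Single-threaded local session keeps the test suite fast and hermetic.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def test_add_order_total(spark):
    df = spark.createDataFrame([(2, 5.0), (3, 1.5)], ["quantity", "unit_price"])
    result = add_order_total(df).collect()
    assert [row["total"] for row in result] == [10.0, 4.5]
```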