SGI

Lead Data Engineer - Databricks, PySpark & AWS

⭐ - Featured Role | Apply direct with Data Freelance Hub
This is a 12-month, on-site contract in Houston, TX for a Lead Data Engineer specializing in Databricks, PySpark, and AWS. It requires 8+ years of experience, strong coding skills, and expertise in data modernization and financial data domains.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
January 29, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
On-site
📄 - Contract
W2 Contractor
🔒 - Security
Unknown
📍 - Location detailed
Houston, TX
🧠 - Skills detailed
#Kafka (Apache Kafka) #Databricks #Observability #SQL (Structured Query Language) #Consulting #Cloud #Data Engineering #PySpark #Python #Leadership #Data Pipeline #Batch #Migration #Delta Lake #Monitoring #Scala #Spark (Apache Spark) #AWS (Amazon Web Services) #Kubernetes
Role description
Lead Databricks Engineer (Contract)
Location: Houston, TX - The Galleria (100% on-site)
Contract Length: 12 months (extensions likely)
Experience: 8+ years
Open to Green Card holders and US citizens

Overview
We are seeking a Lead Databricks Engineer to support large-scale data platform initiatives focused on data modernization, cloud migration, and advanced analytics. This is a hands-on senior role requiring deep Databricks expertise, strong AWS experience, and the ability to partner closely with business and technical stakeholders. The ideal consultant has led complex Databricks implementations end to end, codes at a high level, and has experience modernizing legacy data platforms into scalable, cloud-native architectures.

Key Responsibilities
• Lead the design and implementation of enterprise-scale Databricks solutions on AWS
• Drive data modernization initiatives, including lift-and-shift and re-architecture of legacy data platforms
• Build and optimize data pipelines using Python, PySpark, and SQL (a minimal batch-pipeline sketch follows this listing)
• Design and manage Delta Lake architectures and implement Unity Catalog for governance and access control (see the governance sketch after this listing)
• Develop and support streaming and batch data workloads (see the streaming sketch after this listing)
• Configure and optimize Databricks clusters and serverless compute for performance and cost
• Integrate Databricks with upstream and downstream systems (APIs, data sources, analytics tools)
• Partner with stakeholders to gather user and business requirements and translate them into technical solutions
• Implement observability, monitoring, and cost controls, including usage, volume, and pricing metrics
• Support data domains related to billing, accounting, pricing, and volume metrics
• Provide technical leadership, best practices, and mentorship to engineering teams

Required Skills & Experience
• 8+ years of experience in data engineering, with a recent hands-on focus on Databricks
• Strong experience deploying Databricks on AWS
• Advanced coding skills in Python, PySpark, and SQL
• Deep knowledge of Delta Lake, Unity Catalog, and Databricks workspace governance
• Experience with streaming data (Structured Streaming, Kafka, or similar)
• Strong understanding of Databricks cluster management, serverless compute, and performance tuning
• Experience integrating Databricks with enterprise systems and data sources
• Proven ability to work directly with business and technical stakeholders
• Experience supporting financial data domains (billing, accounting, pricing, usage metrics) is highly preferred
• Strong communication skills and the ability to lead technical discussions
• Experience with AKS / Kubernetes environments

Nice to Have
• Databricks or AWS certifications
• Consulting or contracting background in large enterprise environments

Work Requirements
• 100% on-site role in Houston, TX
• Must be authorized to work in the United States
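
To give candidates a concrete sense of the hands-on coding level described above, here is a minimal batch-pipeline sketch in PySpark that writes a Delta table. Every specific in it (bucket names, paths, the invoice_id and amount columns) is a hypothetical placeholder, not a detail of this engagement.

```python
# Minimal sketch of a batch ingestion job: raw CSV -> curated Delta table.
# Bucket names, paths, and columns (invoice_id, amount) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("billing-batch-sketch").getOrCreate()

raw = (
    spark.read
    .option("header", "true")
    .csv("s3://example-bucket/raw/billing/")  # hypothetical source path
)

curated = (
    raw
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("ingest_date", F.current_date())
    .dropDuplicates(["invoice_id"])  # hypothetical business key
)

(
    curated.write
    .format("delta")            # Delta is built in on Databricks; elsewhere
    .mode("overwrite")          # it requires the delta-spark package
    .partitionBy("ingest_date")
    .save("s3://example-bucket/curated/billing/")  # hypothetical target path
)
```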
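
Similarly, a minimal sketch of the streaming side (Structured Streaming from Kafka into Delta), again with hypothetical broker, topic, and path names:

```python
# Minimal Structured Streaming sketch: Kafka topic -> Delta table, with a
# checkpoint for fault tolerance. Broker, topic, and paths are hypothetical.
# The Kafka source is bundled with Databricks runtimes; open-source Spark
# needs the spark-sql-kafka connector on the classpath.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("usage-stream-sketch").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "usage-events")               # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
    # Kafka delivers bytes; cast the payload to string for downstream parsing
    .select(F.col("value").cast("string").alias("payload"), F.col("timestamp"))
)

query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/usage/")
    .outputMode("append")
    .start("s3://example-bucket/streams/usage/")  # hypothetical target path
)

query.awaitTermination()
```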
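
And a short illustration of the Unity Catalog governance work named in the responsibilities, assuming a hypothetical finance catalog and an analysts group:

```python
# Illustrative Unity Catalog grants, run as Spark SQL from a notebook.
# Catalog, schema, and group names are hypothetical; a real workspace may
# also require a managed storage location when creating the catalog.
spark.sql("CREATE CATALOG IF NOT EXISTS finance")
spark.sql("CREATE SCHEMA IF NOT EXISTS finance.billing")

# A principal needs USE privileges on the parents before SELECT is usable
spark.sql("GRANT USE CATALOG ON CATALOG finance TO `analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA finance.billing TO `analysts`")
spark.sql("GRANT SELECT ON SCHEMA finance.billing TO `analysts`")
```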