Nasscomm

Data Engineer Lead

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer Lead with a contract length of 4+ months, a hybrid location, and a focus on Azure Databricks. It requires 7+ years of data engineering experience, strong Apache Spark and Delta Lake expertise, and advanced Python and SQL skills.
🌎 - Country
United States
πŸ’± - Currency
$ USD
-
πŸ’° - Day rate
Unknown
-
πŸ—“οΈ - Date
March 20, 2026
πŸ•’ - Duration
4+ months
-
🏝️ - Location
Hybrid
-
πŸ“„ - Contract
Unknown
-
πŸ”’ - Security
Unknown
-
πŸ“ - Location detailed
United States
-
🧠 - Skills detailed
#Data Vault #Azure Databricks #BI (Business Intelligence) #Delta Lake #Code Reviews #Data Lake #ACID (Atomicity, Consistency, Isolation, Durability) #Observability #Azure ADLS (Azure Data Lake Storage) #Azure Synapse Analytics #Stories #ADF (Azure Data Factory) #Snowflake #Java #Programming #Scala #AWS (Amazon Web Services) #PySpark #ADLS (Azure Data Lake Storage) #Spark (Apache Spark) #Batch #Cloud #Data Quality #ML (Machine Learning) #SQL (Structured Query Language) #Synapse #Azure Data Factory #Vault #Scrum #Security #ETL (Extract, Transform, Load) #Azure #Data Engineering #Storage #GCP (Google Cloud Platform) #Python #Data Modeling #Databricks #IAM (Identity and Access Management) #Airflow #Monitoring #Apache Spark
Role description
Job Title: Cloud Tech Lead – Databricks
Location: Hybrid – 2 days/week at the local client office
Contract: 4+ months

As a Tech Lead, you will work closely with business stakeholders, solution architects, the product owner, the scrum master, and subject matter experts (SMEs) to understand requirements and deliver high-quality, scalable solutions on Azure Databricks.

Responsibilities Summary:
• Lead solution design and delivery for Databricks-based platforms and products (batch/streaming, BI-ready models, ML workflows).
• Translate business requirements into reference architectures, implementation plans, and sprint-ready technical stories.
• Serve as the technical decision-maker for tradeoffs (cost, performance, latency, reliability, maintainability).
• Design and implement Lakehouse patterns (Bronze/Silver/Gold), medallion architecture, and domain-oriented data products where applicable (a minimal batch sketch follows this description).
• Build and optimize ETL/ELT using Apache Spark, Databricks SQL, and orchestrators (e.g., Workflows, ADF, Airflow).
• Implement streaming use cases (e.g., Spark Structured Streaming, Delta Live Tables where appropriate; see the streaming sketch below).
• Establish data modeling standards (star/snowflake, Data Vault where relevant) and performance tuning practices.
• Implement access controls, auditing, and governance with Unity Catalog (RBAC/ABAC patterns, lineage, data sharing policies).
• Ensure production readiness: CI/CD, monitoring/alerting, runbooks, incident response, and SLAs/SLOs.
• Drive data quality practices (tests, expectations, reconciliation, observability).
• Define MLOps standards (experiment tracking, reproducibility, champion/challenger, drift monitoring).
• Mentor engineers; conduct design reviews and code reviews, and set engineering standards.
• Partner with product owners, data owners, security, and platform teams; communicate status, risks, and options clearly.
• Contribute to hiring, onboarding, and capability building.

Position Requirements:
• 7+ years in data/platform/analytics engineering, including 2+ years leading technical teams or workstreams.
• Proven production delivery on Databricks (strong Azure Databricks experience preferred).
• Strong Apache Spark expertise (PySpark/Scala): distributed processing, troubleshooting, and performance tuning.
• Deep Delta Lake knowledge: ACID tables, compaction, Z-Ordering, schema evolution, and batch/streaming patterns (see the maintenance sketch below).
• Experience building scalable batch and streaming pipelines, including orchestration and operationalization (scheduling, dependencies, retries, idempotency).
• Strong Azure data platform background, including Azure Data Lake Storage (ADLS) architecture and best practices; familiarity with Azure Data Factory (ADF), Azure Synapse Analytics, and related services.
• Advanced programming skills in Python and SQL, plus hands-on Java experience (e.g., building integrations/services, Spark/streaming components, or platform utilities).
• Cloud fundamentals in at least one major platform (Azure/AWS/Google Cloud Platform (GCP)): identity and access management (IAM), storage, networking basics, and cost controls.
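To make the medallion responsibilities above concrete, here is a minimal PySpark sketch of a Bronze → Silver → Gold batch flow on Azure Databricks. The storage path, table names, and columns are illustrative assumptions, not details from this posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically in a Databricks notebook

# Bronze: land raw source files as-is to preserve fidelity.
# (Path and table names below are hypothetical.)
bronze = (spark.read.format("json")
          .load("abfss://raw@<storage-account>.dfs.core.windows.net/orders/"))
bronze.write.format("delta").mode("append").saveAsTable("bronze.orders")

# Silver: deduplicate and enforce types for downstream use.
silver = (spark.table("bronze.orders")
          .dropDuplicates(["order_id"])
          .withColumn("order_ts", F.to_timestamp("order_ts")))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")

# Gold: BI-ready aggregate consumed by reporting tools.
gold = (spark.table("silver.orders")
        .groupBy("customer_id")
        .agg(F.sum("amount").alias("lifetime_value")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.customer_value")
```

Each layer is a Delta table, which is what gives the ACID guarantees and batch/streaming interoperability the requirements call out.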
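For the streaming responsibilities, here is a minimal Spark Structured Streaming sketch that reads the Bronze Delta table incrementally and writes to Silver. The checkpoint location is what makes restarts and retries idempotent, which is the operationalization concern the requirements mention; table and path names are again assumptions.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Read the Bronze Delta table as an incremental stream.
stream = spark.readStream.table("bronze.orders")

# Light cleansing; dropDuplicates keeps dedup state across micro-batches.
cleaned = stream.dropDuplicates(["order_id"]).filter(F.col("amount") > 0)

(cleaned.writeStream
    .format("delta")
    .option("checkpointLocation", "/checkpoints/silver_orders")  # enables exactly-once restarts
    .trigger(availableNow=True)  # drain available data, then stop (scheduled-batch style)
    .toTable("silver.orders"))
```

The availableNow trigger lets the same streaming code run as a scheduled job (e.g., from Databricks Workflows or ADF) rather than a continuously running cluster.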
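For the Delta Lake requirement (compaction, Z-Ordering, schema evolution), a short maintenance sketch; the table, column, and path names are illustrative assumptions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows on a common filter column
# so reads can skip more data.
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Clean up files no longer referenced by the table (default retention applies).
spark.sql("VACUUM silver.orders")

# Append a batch whose schema adds new columns; mergeSchema lets the
# table evolve instead of the write failing.
new_batch = spark.read.format("json").load("/landing/orders_v2/")
(new_batch.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("silver.orders"))
```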