

Allwyn Corporation
Databricks Engineer Lead (AWS)
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Databricks Engineer Lead (AWS) on a contract basis and requires the Databricks Certified Data Engineer Professional certification. Key skills include Databricks, AWS services, Python, PySpark, and experience with data pipeline implementation.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 22, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Metadata #AWS Kinesis #Storage #Data Management #Batch #BI (Business Intelligence) #Data Ingestion #ETL (Extract, Transform, Load) #PySpark #Data Lineage #Lambda (AWS Lambda) #Leadership #SQL (Structured Query Language) #Scala #Delta Lake #Cloud #Data Science #S3 (Amazon Simple Storage Service) #Data Pipeline #Data Engineering #Kafka (Apache Kafka) #Spark SQL #Redshift #Databricks #AWS (Amazon Web Services) #Strategy #Security #Data Strategy #DevOps #Python #GIT #AI (Artificial Intelligence) #Automation #ML (Machine Learning) #Data Processing #Compliance #Data Lake #Data Lakehouse #Data Storage #Spark (Apache Spark) #Data Quality #Datasets
Role description
Databricks Lead Engineer (AWS)
Databricks Certified Data Engineer Professional certification is mandatory
We are looking for a hands-on Databricks Engineer with strong AWS experience to design, build, and optimize scalable data pipelines and lakehouse solutions. The role focuses on implementing robust batch and streaming data solutions using Databricks, Delta Lake, and AWS cloud-native services, ensuring high performance, scalability, and security.
Key Responsibilities
• Build and maintain end-to-end data pipelines using Databricks, Delta Lake, and AWS services
• Develop batch, real-time, and streaming data processing workflows
• Implement data ingestion, transformation, curation, and storage pipelines
• Build and optimize large-scale PySpark and SQL-based jobs in Databricks
• Enable real-time data processing using Kafka, AWS Kinesis, or similar streaming tools (a streaming sketch follows this list)
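As context for the streaming items above, here is a minimal PySpark sketch of a Kafka-to-Delta ingest job of the kind this role describes. It assumes a Databricks or Spark runtime with the Kafka and Delta connectors available; the broker address, topic name, schema, and S3 paths are illustrative placeholders, not details from this posting.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

# Schema for the incoming JSON payload (illustrative).
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("occurred_at", TimestampType()),
])

# Read a Kafka topic as a stream; broker and topic are hypothetical.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "events")
       .load())

# Parse the raw Kafka value bytes into typed columns.
parsed = (raw
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Append to a Delta table, with a checkpoint for exactly-once semantics.
(parsed.writeStream
 .format("delta")
 .option("checkpointLocation", "s3://example-bucket/checkpoints/events")
 .outputMode("append")
 .start("s3://example-bucket/delta/events"))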
Data Lakehouse Implementation
• Work on Databricks-based lakehouse architecture using Delta Lake (a Delta upsert sketch follows this list)
• Implement scalable and optimized data storage and processing frameworks
• Ensure data quality, consistency, and reliability across pipelines
• Support metadata management, data lineage, and governance implementation
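A minimal sketch of the Delta Lake upsert pattern that keeps curated tables consistent, as referenced above. The table paths and join key are hypothetical, and it assumes the delta-spark package (bundled with Databricks runtimes).

from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Curated target table and the latest staged batch; both paths are hypothetical.
target = DeltaTable.forPath(spark, "s3://example-bucket/delta/customers")
updates = spark.read.format("delta").load("s3://example-bucket/delta/customers_staging")

# MERGE keeps the curated table consistent: update matching rows, insert new ones.
(target.alias("t")
 .merge(updates.alias("u"), "t.customer_id = u.customer_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

Because MERGE is transactional in Delta Lake, downstream readers never observe a half-applied batch.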
Cloud & Platform Engineering (AWS)
• Work with AWS services such as S3, Glue, Lambda, Kinesis, and Redshift (a short AWS sketch follows this list)
• Ensure pipelines are scalable, secure, and cost-optimized in AWS environments
• Implement security controls including RBAC, encryption, and data masking
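For orientation, a short boto3 sketch touching two of the listed AWS services, Kinesis and S3. The stream name, bucket, prefix, and region are placeholders, not details from this posting.

import boto3

# Publish one record to a Kinesis stream (hypothetical stream name and region).
kinesis = boto3.client("kinesis", region_name="us-east-1")
kinesis.put_record(
    StreamName="events-stream",
    Data=b'{"event_id": "abc-123"}',
    PartitionKey="abc-123",
)

# List the objects backing a (hypothetical) Delta table prefix in S3.
s3 = boto3.client("s3", region_name="us-east-1")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="example-bucket", Prefix="delta/events/"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])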
Optimization & Best Practices
• Tune Spark jobs for performance and cost efficiency (a tuning sketch follows this list)
• Monitor and troubleshoot data pipeline issues in production
• Follow CI/CD and DevOps practices for deploying data engineering solutions
• Ensure adherence to data engineering standards and best practices
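A brief sketch of the kind of Spark tuning the bullets above refer to. The configuration values and paths are illustrative, not recommendations for any particular workload.

from pyspark.sql import SparkSession

# Common performance knobs; values here are illustrative.
spark = (SparkSession.builder
         .config("spark.sql.adaptive.enabled", "true")            # adaptive query execution
         .config("spark.sql.shuffle.partitions", "400")           # size shuffles to the data volume
         .config("spark.sql.autoBroadcastJoinThreshold", "64MB")  # broadcast small dimension tables
         .getOrCreate())

df = spark.read.format("delta").load("s3://example-bucket/delta/events")  # hypothetical path

# Compact the output into a few files to avoid the small-files problem on S3.
result = df.groupBy("event_type").count().coalesce(8)
result.write.format("delta").mode("overwrite").save("s3://example-bucket/delta/event_counts")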
Collaboration
• Work closely with BI teams and business stakeholders
• Support analytics and AI/ML data requirements through curated datasets
• Collaborate with architects to ensure alignment with AWS-based data strategy
Technical Leadership & Architecture
• Lead the design and implementation of scalable, end‑to‑end data pipelines using Databricks, Delta Lake, and AWS services.
• Architect and oversee batch, real‑time, and streaming data processing frameworks.
• Drive the adoption of lakehouse best practices, ensuring robust data quality, governance, and lineage.
• Provide technical direction on PySpark, SQL, and distributed data processing optimization.
• Guide the team in implementing secure, cost‑efficient, and high‑performance data solutions on AWS.
Team & Project Leadership
• Mentor and coach data engineers, fostering skill development and high‑quality engineering practices.
• Lead sprint planning, workload distribution, and delivery oversight for data engineering initiatives.
• Establish coding standards, review pull requests, and ensure adherence to CI/CD and DevOps practices.
• Collaborate with architects to align team deliverables with enterprise data strategy and platform roadmap.
• Drive continuous improvement across the team through automation, tooling enhancements, and process refinement.
Cross-Functional Collaboration
• Partner with BI, analytics, and data science teams to deliver curated, high‑quality datasets for reporting and ML use cases.
• Work closely with business stakeholders to translate requirements into scalable data engineering solutions.
• Coordinate with cloud, security, and platform teams to ensure compliance, governance, and operational excellence.
Required Skills & Qualifications
• Strong hands-on experience with Databricks
• Proficiency in Python, PySpark, and SQL
• Strong experience in AWS cloud services (S3, Glue, Lambda, Kinesis, Redshift)
• Experience building ETL/ELT data pipelines
• Strong understanding of Delta Lake and lakehouse concepts
• Experience with streaming and batch data processing
• Knowledge of CI/CD tools and Git
• Strong troubleshooting and performance tuning skills