

E-IT
Databricks Data Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Databricks Data Engineer with strong DevOps expertise, offering a hybrid contract in Los Angeles, CA or New York, NY. Key skills include PySpark, SQL, AWS, and experience with ETL/ELT pipelines. Duration exceeds 6 months.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: June 6, 2026
Duration: More than 6 months
Location: Hybrid
Contract: Unknown
Security: Unknown
Location detailed: Los Angeles, CA
Skills detailed
#Scala #Delta Lake #Security #Data Governance #Datasets #PySpark #Compliance #Databases #Cloud #API (Application Programming Interface) #Lambda (AWS Lambda) #Data Engineering #S3 (Amazon Simple Storage Service) #Observability #Data Pipeline #Triggers #IP (Internet Protocol) #ETL (Extract, Transform, Load) #Logging #DevOps #BI (Business Intelligence) #Libraries #ACID (Atomicity, Consistency, Isolation, Durability) #Monitoring #GIT #Spark SQL #Terraform #GitLab #Data Analysis #Databricks #GitHub #Version Control #Deployment #Data Processing #Data Warehouse #AWS (Amazon Web Services) #Storage #SQL (Structured Query Language) #Spark (Apache Spark)
Role description
Job Title: Databricks Data Engineer
Location: Los Angeles, CA or New York, NY (Hybrid)
Contract / Full-time
Job Summary
We are looking for an experienced Databricks Data Engineer with strong DevOps expertise to join our data engineering team. The ideal candidate will design, build, and optimize large-scale pipelines on the Databricks Lakehouse Platform on AWS, while driving automated CI/CD and deployment practices. This role requires strong skills in PySpark, SQL, AWS cloud services, and modern DevOps tooling. You will collaborate closely with cross-functional teams to deliver scalable, secure, and high-performance data solutions.
Must Demonstrate (Critical Skills & Architectural Competencies)
• Designing and implementing Databricks-based Lakehouse architectures on AWS
• Clear separation of compute vs. serving layers
• Ability to design low-latency data/API access strategies (beyond Spark-only patterns)
• Strong understanding of caching strategies for performance and cost optimization
• Data partitioning, storage optimization, and file layout strategy
• Ability to handle multi-terabyte structured or time-series datasets
• Skill in requirement probing, identifying what matters architecturally
• A player-coach mindset: hands-on engineering + technical leadership
Key Responsibilities
1. Data Pipeline Development
• Design, build, and maintain scalable ETL/ELT pipelines using Databricks on AWS.
• Develop high-performance data processing workflows using PySpark/Spark and SQL.
• Integrate data from Amazon S3, relational databases, and semi/non-structured sources.
• Implement Delta Lake best practices including schema evolution, ACID, OPTIMIZE, ZORDER, partitioning, and file-size tuning.
• Ensure architectures support high-volume, multi-terabyte workloads.
2. DevOps & CI/CD
• Implement CI/CD pipelines for Databricks using Git, GitLab, GitHub Actions, or AWS-native tools.
• Build and manage automated deployments using Databricks Asset Bundles.
• Manage version control for notebooks, workflows, libraries, and environment configuration.
• Automate cluster policies, job creation, environment provisioning, and configuration management.
• Support infrastructure-as-code via Terraform (preferred) or CloudFormation.
3. Collaboration & Business Support
• Work with data analysts and BI teams to prepare curated datasets for reporting and analytics.
• Collaborate closely with product owners, engineering teams, and business partners to translate requirements into scalable implementations.
• Document data flows, technical architecture, and DevOps/deployment workflows.
4. Performance & Optimization
• Tune Spark clusters, workflows, and queries for cost efficiency and compute performance.
• Monitor pipelines, troubleshoot failures, and maintain high reliability.
• Implement logging, monitoring, and observability across workflows and jobs.
• Apply caching strategies and workload optimization techniques to support low-latency consumption patterns.
5. Governance & Security
• Implement and maintain data governance using Unity Catalog.
• Enforce access controls, security policies, and data compliance requirements.
• Ensure lineage, quality checks, and auditability across data flows.
Technical Skills
• Strong hands-on experience with Databricks, including:
  • Delta Lake
  • Unity Catalog
  • Lakehouse Architecture
  • Delta Live Pipelines
  • Databricks Runtime
  • Table Triggers
  • Databricks Workflows
• Proficiency in PySpark, Spark, and advanced SQL.
• Expertise with AWS cloud services, including:
  • S3
  • IAM
  • Glue / Glue Catalog
  • Lambda
  • Kinesis (optional but beneficial)
  • Secrets Manager
• Strong understanding of DevOps tools:
  • Git / GitLab
  • CI/CD pipelines
  • Databricks Asset Bundles
• Familiarity with Terraform is a plus.
• Experience with relational databases and data warehouse concepts.
Preferred Experience
• Knowledge of streaming technologies like Structured Streaming/Spark Streaming.
• Experience building real-time or near real-time pipelines.
• Exposure to advanced Databricks runtime configurations and performance tuning.
Certifications (Optional)
• Databricks Certified Data Engineer Associate / Professional
• AWS Data Engineer or AWS Solutions Architect certification






