

Zeektek
Databricks Engineer - Python, PySpark, Apache Spark, SQL
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Databricks Engineer on a 6-9 month remote contract, with the pay rate listed as "TBD." Candidates must have 3+ years of experience with Spark, Python, SQL, and Databricks, along with a healthcare data background and strong software engineering skills.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
720
🗓️ - Date
February 4, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
Remote
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#GIT #Apache Spark #Python #Data Quality #Spark SQL #Databricks #Unit Testing #Data Pipeline #Data Engineering #GitHub #Code Reviews #SQL (Structured Query Language) #ML (Machine Learning) #Spark (Apache Spark) #ETL (Extract, Transform, Load) #Data Transformations #Automation #PySpark #Scala #Batch #Integration Testing #Cloud #Data Science
Role description
We have a 6-9 month contract, with potential for contract-to-hire, for a skilled Data Engineer to design, build, and optimize scalable data pipelines and automation for analytics solutions in a modern cloud-based environment. The role is highly hands-on and focuses on developing reliable, performant data workflows using Spark, Python, SQL, and Databricks while following strong software engineering best practices.
You’ll collaborate closely with data scientists, analytics teams, and platform engineers to deliver high-quality data solutions that support reporting, analytics, and machine learning initiatives.
The position is remote.
• MUST HAVES:
• 3 or more years of experience in Spark
• 3 or more years of experience in Python
• 3 or more years of experience in SQL
• 5 or more years of experience in software engineering (unit tests, integration tests, GitHub, dependency management, CI/CD)
• 3 or more years of experience in Databricks
• Healthcare Data background
• Automation experience
• Preferred Experience:
• 3 or more years of experience with PySpark
• Git and MLOps best practices
• About this Role: Focused on foundational work supporting data solutions initiatives
• Key Responsibilities:
• Design, develop, and maintain scalable data pipelines using Apache Spark and Databricks
• Build and optimize data transformations using Python, PySpark, and SQL (see the PySpark sketch after this list)
• Ensure data quality, reliability, and performance across batch and streaming workloads
• Apply strong software engineering best practices, including unit testing, integration testing, and code reviews (a unit-test sketch follows this list)
• Manage source control using GitHub and participate in CI/CD workflows
• Collaborate with cross-functional teams to support analytics and ML use cases
• Troubleshoot and resolve data pipeline and performance issues
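The posting itself contains no code, but to give a concrete flavor of the responsibilities above, here is a minimal, hypothetical PySpark sketch of a batch transformation with a simple data-quality gate. The table names (raw.healthcare_claims, analytics.member_daily_claims) and columns are invented for illustration and are not taken from the role.

```python
# Minimal, hypothetical sketch of the kind of transformation this role involves.
# Table names, columns, and layouts are invented; a real Databricks job would
# read from governed sources (e.g., Unity Catalog tables).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("claims-daily-rollup").getOrCreate()

# Load raw claims data (hypothetical source table).
claims = spark.read.table("raw.healthcare_claims")

# Basic data-quality gate: drop records missing required keys.
clean = claims.dropna(subset=["claim_id", "member_id", "service_date"])

# Aggregate billed amounts per member per day.
daily = (
    clean
    .groupBy("member_id", F.to_date("service_date").alias("service_day"))
    .agg(
        F.count("claim_id").alias("claim_count"),
        F.sum("billed_amount").alias("total_billed"),
    )
)

# Persist as a Delta table for downstream analytics. Overwrite keeps the
# sketch simple; a production pipeline would likely use incremental or
# merge-based writes instead.
daily.write.format("delta").mode("overwrite").saveAsTable(
    "analytics.member_daily_claims"
)
```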
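The must-haves also emphasize unit testing and CI/CD alongside Spark. One common pattern, sketched below using the same invented names, is to factor transformations into pure DataFrame-in/DataFrame-out functions and test them with pytest against a local SparkSession. This is an illustrative example, not part of the posting.

```python
# Hypothetical pytest-style unit test for the rollup logic above, run against
# a local SparkSession rather than a cluster. Column names mirror the sketch
# above and are equally illustrative.
import pytest
from pyspark.sql import SparkSession, functions as F

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def daily_rollup(claims_df):
    """The transformation under test, factored into a pure function."""
    clean = claims_df.dropna(subset=["claim_id", "member_id", "service_date"])
    return (
        clean
        .groupBy("member_id", F.to_date("service_date").alias("service_day"))
        .agg(
            F.count("claim_id").alias("claim_count"),
            F.sum("billed_amount").alias("total_billed"),
        )
    )

def test_daily_rollup_aggregates_and_filters(spark):
    rows = [
        ("c1", "m1", "2026-01-01", 100.0),
        ("c2", "m1", "2026-01-01", 50.0),
        ("c3", None, "2026-01-01", 25.0),  # missing member_id: should be dropped
    ]
    df = spark.createDataFrame(
        rows, ["claim_id", "member_id", "service_date", "billed_amount"]
    )
    result = daily_rollup(df).collect()
    assert len(result) == 1
    assert result[0]["claim_count"] == 2
    assert result[0]["total_billed"] == 150.0
```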