

Zeektek
Data Engineer II - Spark - Python - SQL, Databricks Hybrid St Louis/Remote Contract to Permanent Job
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is a Data Engineer II position focused on Spark, Python, SQL, and Databricks, requiring 3+ years in each skill and 5+ years in software engineering. It is a hybrid contract-to-permanent role based in St. Louis, MO.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
January 9, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Greater St. Louis
-
🧠 - Skills detailed
#Batch #GitHub #Databricks #Data Science #GIT #Spark (Apache Spark) #Unit Testing #Integration Testing #Data Quality #PySpark #SQL (Structured Query Language) #Cloud #Python #ML (Machine Learning) #Code Reviews #Scala #Apache Spark #Data Pipeline #ETL (Extract, Transform, Load) #Data Engineering #Data Transformations
Role description
We have a 6-month contract-to-hire opening for a skilled Data Engineer to design, build, and optimize scalable data pipelines and analytics solutions in a modern cloud-based environment. This role is highly hands-on and focuses on developing reliable, performant data workflows using Spark, Python, SQL, and Databricks, while following strong software engineering best practices.
You’ll collaborate closely with data scientists, analytics teams, and platform engineers to deliver high-quality data solutions that support reporting, analytics, and machine learning initiatives.
• Must be located in St. Louis, MO; this is a hybrid role with onsite time required
• MUST HAVES:
• 3 or more years of experience in Spark
• 3 or more years of experience in Python
• 3 or more years of experience in SQL
• 5 or more years of experience in software engineering (unit tests, integration tests, GitHub, dependency management, CI/CD)
• 3 or more years of experience in Databricks
• Preferred Experience:
• 3 or more years of experience with PySpark
• Git and MLOps best practices
• About this Role:
• Focused on foundational work supporting data solutions initiatives
• Key Responsibilities:
• Design, develop, and maintain scalable data pipelines using Apache Spark and Databricks
• Build and optimize data transformations using Python, PySpark, and SQL
• Ensure data quality, reliability, and performance across batch and streaming workloads
• Apply strong software engineering best practices, including unit testing, integration testing, and code reviews
• Manage source control using GitHub and participate in CI/CD workflows
• Collaborate with cross-functional teams to support analytics and ML use cases
• Troubleshoot and resolve data pipeline and performance issues
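The responsibilities above emphasize unit-testable data transformations alongside pipeline work. As a purely illustrative sketch (the function name, fields, and data are hypothetical, and plain Python rows stand in for Spark DataFrame rows, since the real implementation would run as PySpark on Databricks), a transformation written as a pure function is straightforward to cover with the kind of unit tests the posting asks for:

```python
# Hypothetical sketch: a unit-testable batch-cleanup transformation.
# Plain Python dicts stand in for DataFrame rows; on Databricks the same
# logic would typically use PySpark (e.g., a window over the key ordered
# by timestamp), but the testing pattern is identical.

def dedupe_latest(rows, key, ts_field):
    """Keep only the most recent row per key value."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[ts_field] > latest[k][ts_field]:
            latest[k] = row
    return list(latest.values())

def test_dedupe_latest():
    """Unit test in the style the role requires (unit tests, code reviews)."""
    rows = [
        {"id": 1, "ts": "2026-01-01", "val": "old"},
        {"id": 1, "ts": "2026-01-05", "val": "new"},
        {"id": 2, "ts": "2026-01-02", "val": "only"},
    ]
    result = {r["id"]: r["val"] for r in dedupe_latest(rows, "id", "ts")}
    assert result == {1: "new", 2: "only"}

test_dedupe_latest()
```

Keeping the transformation logic separate from Spark session setup is one common way teams make pipeline code reviewable and CI-friendly, which lines up with the GitHub/CI-CD requirements listed above.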





