SGI

Databricks Spark SME

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Databricks Spark SME in Houston, TX, on a 12-month contract. Key skills include Apache Spark expertise, a strong software engineering background, and experience with large transactional datasets and real-time streaming. Hands-on Databricks and AWS experience is required.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
March 13, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Houston, TX
🧠 - Skills detailed
#Spark (Apache Spark) #Apache Spark #Hadoop #AWS (Amazon Web Services) #Databricks #Data Engineering #Java #Kafka (Apache Kafka) #Scala #Data Processing #Data Pipeline #Datasets
Role description
Databricks Spark SME (Transactional Data)
Hybrid – Houston, TX
12-Month Contract

We are supporting a client who is looking for a Databricks Spark SME to join their data engineering team on a 12-month contract. This role will focus on optimizing Spark workloads processing large-scale transactional datasets and improving performance, latency, and cost efficiency across the platform. This is a hands-on engineering role, ideal for someone with a strong software engineering background and deep expertise in Spark internals and distributed data processing.

Responsibilities
• Act as the Spark subject matter expert for performance optimization across the Databricks platform
• Analyze and optimize Spark jobs, clusters, and query performance
• Troubleshoot and resolve latency issues in real-time streaming pipelines
• Optimize cost and compute performance across Databricks workloads
• Improve processing efficiency for large-scale transactional data pipelines
• Work closely with data engineering teams to design high-performance Spark-based data processing frameworks
• Optimize serverless and distributed compute workloads for large-scale data processing
• Collaborate with stakeholders across engineering and platform teams to ensure scalable and efficient solutions

Requirements
• Strong software engineering background (Java, Hadoop, or similar)
• Deep expertise with Apache Spark, including Spark internals and performance tuning
• Hands-on experience with the Databricks platform
• Experience working with large transactional datasets
• Strong experience with real-time or near real-time streaming pipelines (e.g., Kafka, Spark Structured Streaming); see the illustrative sketch below
• Experience resolving performance, latency, and scaling challenges in distributed data systems
• Hands-on experience working in AWS environments
• Experience optimizing large-scale data processing workloads and compute costs
• Experience with serverless compute on Databricks

Nice to Have
• Experience supporting high-throughput transactional systems
• Experience in high-scale data environments
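For context only, here is a minimal Spark Structured Streaming sketch in Scala of the kind of Kafka-fed transactional pipeline this role would be tuning. It is not the client's actual pipeline: the broker address, topic name, checkpoint and table paths, and the windowed count are hypothetical placeholders chosen for illustration.

```scala
// Illustrative sketch only: a Kafka -> Spark Structured Streaming -> Delta job
// of the kind described in the role. All names and paths below are placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object TransactionStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("transaction-stream-sketch")
      .getOrCreate()

    // Read raw transaction events from Kafka (broker and topic are hypothetical).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "transactions")
      .option("maxOffsetsPerTrigger", "100000") // cap per-micro-batch volume to bound latency
      .load()

    // Count transactions per one-minute window; the watermark bounds streaming state size.
    val counts = raw
      .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window(col("timestamp"), "1 minute"))
      .count()

    // Write results to a Delta table; the checkpoint makes the stream restartable.
    counts.writeStream
      .format("delta")
      .outputMode("append")
      .option("checkpointLocation", "/tmp/checkpoints/transactions")
      .start("/tmp/delta/transaction_counts")
      .awaitTermination()
  }
}
```

In jobs like this, settings such as maxOffsetsPerTrigger, the watermark interval, and cluster sizing are the typical levers for trading throughput against latency, state size, and compute cost, which is the optimization work the role centers on.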