

Data Engineer - Databricks
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is a 6-month Data Engineer - Databricks contract in The Woodlands, TX, offering competitive pay. It requires 5+ years in data engineering; proficiency in Databricks, Apache Spark, Python, SQL, and AWS services; and experience with AI/ML workflows.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
-
🗓️ - Date discovered
July 29, 2025
🕒 - Project duration
6 months
🏝️ - Location type
On-site
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
The Woodlands, TX
🧠 - Skills detailed
#Data Science #MLflow #Delta Lake #DevOps #Data Pipeline #AI (Artificial Intelligence) #Computer Science #S3 (Amazon Simple Storage Service) #Azure DevOps #Data Quality #Agile #Datasets #Kafka (Apache Kafka) #Big Data #Data Lake #Business Analysis #ML (Machine Learning) #AWS (Amazon Web Services) #Jenkins #Azure #Python #Data Governance #Apache Spark #Deployment #Redshift #Spark (Apache Spark) #Scala #GDPR (General Data Protection Regulation) #GitHub #Databricks #ETL (Extract, Transform, Load) #IAM (Identity and Access Management) #Data Engineering #Compliance #Cloud #Model Deployment #PySpark #Terraform #Documentation #Data Lakehouse #Lambda (AWS Lambda) #SQL (Structured Query Language)
Role description
CMK Resources is seeking a highly skilled and experienced Data Engineer to support one of our key customers in the waste management industry. The ideal candidate will have a strong background in building scalable data pipelines, optimizing big data workflows, and integrating Databricks with AWS services. This role will play a pivotal part in enabling the customer's data engineering and analytics initiatives, especially AI-driven projects, by implementing cloud-native architectures that fuel innovation and sustainability.
Relocation assistance is available for candidates interested in joining the team onsite in The Woodlands, TX.
Key Responsibilities:
• Partner directly with the customer’s data engineering team to design and deliver scalable, cloud-based data solutions.
• Execute complex ad-hoc queries using Databricks SQL to explore large lakehouse datasets and uncover actionable insights.
• Leverage Databricks notebooks to develop robust data transformation workflows using PySpark and SQL.
• Design, develop, and maintain scalable data pipelines using Apache Spark on Databricks.
• Build ETL/ELT workflows with AWS services such as S3, Glue, Lambda, Redshift, and EMR.
• Optimize Spark jobs for both performance and cost within the customer’s cloud infrastructure.
• Collaborate with data scientists, ML engineers, and business analysts to support AI and machine learning use cases, including data preparation, feature engineering, and model operationalization.
• Contribute to the development of AI-powered solutions that improve operational efficiency, route optimization, and predictive maintenance in the waste management domain.
• Implement CI/CD pipelines for Databricks jobs using GitHub Actions, Azure DevOps, or Jenkins.
• Ensure data quality, lineage, and compliance through tools like Unity Catalog, Delta Lake, and AWS Lake Formation.
• Troubleshoot and maintain production data pipelines.
• Provide mentorship and share best practices with both internal and customer teams.
Required Qualifications:
• Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
• 5+ years of experience in software/data engineering, including at least 2 years working with Databricks and Apache Spark.
• Strong proficiency in Python, SQL, and PySpark.
• Deep understanding of AWS cloud services (S3, Glue, Lambda, IAM, Redshift, CloudWatch).
• Experience with Delta Lake, Databricks Workflows, and Databricks SQL.
• Solid grasp of data lakehouse and warehousing architectures.
• Prior experience supporting AI/ML workflows, including training data pipelines and model deployment support.
• Familiarity with infrastructure-as-code tools like Terraform or CloudFormation.
• Strong analytical and troubleshooting skills in a fast-paced, agile environment.
Preferred Qualifications:
• Databricks Certified Data Engineer or AWS Certified Solutions Architect.
• Experience with real-time data using Kafka, Kinesis, or Structured Streaming.
• Familiarity with MLflow, feature stores, or MLOps practices.
• Experience delivering AI/ML projects in industrial or operational contexts (e.g., waste management, logistics, energy).
• Knowledge of data governance and compliance (e.g., GDPR, HIPAA).
Soft Skills:
• Excellent collaboration skills for interfacing with both technical and non-technical customer stakeholders.
• Clear communicator with strong documentation habits.
• Comfortable leading discussions, offering strategic input, and mentoring others.
Direct Applicants Only – No Staffing Agencies or Third-Party Recruiters
We are not accepting solicitations from staffing agencies, recruiting firms, or third-party vendors for this position. Any unsolicited resumes or candidate submissions from such entities will not be considered, and we will not be responsible for any associated fees.
Thank you for respecting this policy.