Databricks and AWS Focused Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Databricks and AWS Focused Data Engineer, onsite in Columbus, OH, for a 3+ month contract. Requires 6-9 years of experience in Databricks, PySpark, AWS, Terraform, and medallion architecture. Certifications preferred.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
-
πŸ—“οΈ - Date discovered
September 30, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
On-site
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
πŸ“ - Location detailed
Columbus, OH
🧠 - Skills detailed
#Datasets #Security #Terraform #Infrastructure as Code (IaC) #Monitoring #Batch #Data Quality #Data Processing #AWS (Amazon Web Services) #Collibra #IAM (Identity and Access Management) #Data Pipeline #Data Lineage #S3 (Amazon Simple Storage Service) #GitHub #Datadog #Pytest #Scala #Alation #Lambda (AWS Lambda) #Databricks #Data Engineering #Kafka (Apache Kafka) #PySpark #Logging #Spark (Apache Spark) #AWS S3 (Amazon Simple Storage Service) #Documentation #Scrum #Data Catalog #AI (Artificial Intelligence) #ML (Machine Learning) #Compliance #Delta Lake #Agile #GIT #Airflow #Cloud
Role description
Job Summary: Databricks and AWS Data Engineer (Contract)
• Position: Data Engineer (Databricks and AWS Focus)
• Location: Onsite, Columbus, OH
• Duration: 3+ Months (Contract)

Core Responsibilities
• Develop and maintain scalable data pipelines using PySpark/Spark on Databricks.
• Implement medallion architecture (raw, trusted, refined layers) for data processing; a minimal sketch appears after this description.
• Integrate streaming (Kafka) and batch data sources, including APIs.
• Model/register datasets in enterprise data catalogs to ensure governance and accessibility.
• Manage secure, role-based access controls for analytics, AI, and ML use cases.
• Collaborate with team members to deliver high-quality, well-tested code.
• Optimize and operationalize Spark jobs and Delta Lake performance on AWS.
• Implement data quality checks, validations, and CI/CD for Databricks workflows.
• Provision/manage Databricks and AWS resources using Terraform (IaC).
• Set up monitoring, logging, and alerts (CloudWatch, Datadog, Databricks audit logs).
• Produce technical documentation, runbooks, and data lineage.

Required Skills & Qualifications
• 6-9 years of expert-level Databricks experience.
• 6-9 years of advanced, hands-on PySpark/Spark experience.
• 6-9 years with AWS, S3, and Terraform (IaC).
• Strong knowledge of medallion architecture and data warehousing best practices.
• Experience building, optimizing, and governing enterprise data pipelines.
• Expertise in Delta Lake internals, time travel, schema enforcement, and Unity Catalog RBAC/ABAC.
• Hands-on experience with Spark Structured Streaming, Kafka, and late-arriving data handling.
• Familiarity with Git-based workflows and CI/CD (Databricks Repos, dbx, GitHub Actions, etc.).
• Experience with security and compliance: IAM, encryption, secrets management, PII governance.
• Proven ability to tune Spark jobs and optimize Databricks/AWS usage for performance and cost.
• Experience working in Agile/Scrum teams and code review processes.

Preferred Skills & Qualifications
• Certifications: Databricks Data Engineer Professional, AWS Solutions Architect/Developer, Terraform Associate.
• Experience with enterprise data catalogs (Collibra, Alation) and data lineage tools (OpenLineage).
• Experience with orchestration tools: Databricks Workflows, Airflow.
• Additional AWS services: Glue, Lambda, Step Functions, CloudWatch, Secrets Manager.
• Experience with testing frameworks: pytest, chispa, Great Expectations, dbx test.
• Background in analytics/ML pipelines and MLOps integrations.

Note: Must submit date of birth, full resume, and full updated LinkedIn profile.
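For context on the medallion responsibilities above, here is a minimal, illustrative PySpark/Delta Lake sketch of one raw (bronze) → trusted (silver) hop. All table names, columns, and paths are hypothetical and are not taken from this posting; a production pipeline would add schema enforcement, Unity Catalog grants, streaming ingestion, and CI/CD around it.

```python
# Illustrative raw -> trusted medallion hop (hypothetical names throughout).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already provided as `spark` on Databricks

raw_orders = spark.read.table("raw.orders")  # bronze: data landed as-is

trusted_orders = (
    raw_orders
    .dropDuplicates(["order_id"])                         # drop replayed/duplicate records
    .withColumn("event_ts", F.to_timestamp("event_ts"))   # normalize types
    .filter(F.col("order_id").isNotNull())                # simple data quality rule
)

# Placeholder check; the posting lists Great Expectations, chispa, and pytest for real tests.
assert trusted_orders.limit(1).count() == 1, "trusted.orders would be written empty"

(
    trusted_orders.write
    .format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .saveAsTable("trusted.orders")                        # silver: cleaned, conformed data
)
```

The refined (gold) layer would follow the same pattern, aggregating trusted tables for the analytics, AI, and ML consumers mentioned above.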