Soho Square Solutions

Senior Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Engineer on a contract of unspecified length, with a listed day rate of 680 USD. The position requires 7+ years of data engineering experience, expertise in Databricks and Medallion Architecture, and familiarity with enterprise systems such as SAP and Salesforce.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
680
-
🗓️ - Date
June 3, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#AWS (Amazon Web Services) #Data Lineage #Security #Databricks #SQL (Structured Query Language) #Spark (Apache Spark) #PySpark #Compliance #Metadata #ETL (Extract, Transform, Load) #Monitoring #SAP #Python #Data Pipeline #Scala #Data Quality #Azure #Cloud #Datasets #Data Engineering #Libraries
Role description
We are seeking a Senior Data Engineer with deep expertise in the Databricks ecosystem and Medallion Architecture to lead a critical regulatory inspection readiness data initiative. In this role, you will own the end-to-end design and implementation of scalable data pipelines built to ingest, parse, and transform vast volumes of unstructured quality and operational documents (PDFs, Word files, images, Excel sheets) into business-ready, structured Gold-layer datasets. The ideal candidate brings a proven track record of handling unstructured data pipelines natively within Databricks and has experience operating within enterprise environments utilizing SAP, Salesforce, or TrackWise.

Core Responsibilities:
• Pipeline Architecture: Architect, build, and maintain production-grade data pipelines utilizing Databricks and Medallion Architecture (Bronze -> Silver -> Gold).
• Unstructured Data Engineering: Design robust frameworks to ingest and transform unstructured data formats (PDFs, images, Word docs, text logs) from enterprise source systems into structured, query-ready Gold-layer assets.
• Regulatory Data Curation: Partner with Quality and Compliance teams to model data specifically optimized for rapid audit retrieval and regulatory inspection readiness.
• Framework Development: Build reusable data quality validation frameworks, monitoring rules, and error-handling mechanisms across all pipeline stages.
• Platform Governance: Leverage Databricks features (Unity Catalog, Workflows) to ensure data lineage, security compliance, and access control across dev and prod workspaces.

Qualifications

Required:
• 7+ years of hands-on data engineering experience, with a heavy focus on Python, SQL, and PySpark.
• 3+ years of production experience designing and deploying Medallion Architecture frameworks natively inside Databricks.
• Demonstrated, real-world experience building extraction and parsing pipelines for unstructured data (extracting text/metadata from PDFs, images, docs).
• Proven ability to build highly reliable data transformation frameworks from the ground up.

Preferred:
• Technical familiarity or integration experience with enterprise systems: TrackWise, SAP, and Salesforce.
• Experience working within highly regulated industries (Life Sciences, Pharma, Biotech, or Medical Devices) under GxP or strict compliance standards.
• Experience with document parsing libraries or cloud OCR tools (e.g., Azure Document Intelligence, AWS Textract, Unstructured.io).