Ampstek

Senior Data Architect – Databricks Migration (MDM & Data Quality & DataFlux)

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Architect focused on Databricks Migration, with a contract length of "unknown" and a pay rate of "unknown." Key skills include DataFlux expertise, MDM experience, and proficiency in Databricks and PySpark. Remote work is available.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
May 13, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
New York, United States
-
🧠 - Skills detailed
#Collibra #Migration #Libraries #Fivetran #Data Modeling #Security #Spark SQL #Data Architecture #Data Vault #Alation #Spark (Apache Spark) #ADF (Azure Data Factory) #Scala #Vault #Data Management #dbt (data build tool) #Clustering #GDPR (General Data Protection Regulation) #GCP (Google Cloud Platform) #Data Governance #Databricks #Cloud #Monitoring #Metadata #Consulting #Strategy #Informatica #Leadership #AWS (Amazon Web Services) #Delta Lake #Azure #Base #SQL (Structured Query Language) #SAS #Data Quality #MDM (Master Data Management) #Data Stewardship #Data Integration #PySpark #Airflow #Data Engineering
Role description
Role: Senior Data Architect – Databricks Migration (MDM & Data Quality & DataFlux)
Location: NYC or Remote
Client: Virtusa / Excellus Health Plan Inc

Overview
Our client is undertaking a strategic modernization initiative to migrate their enterprise data quality, MDM, and data integration workloads from SAS DataFlux (dfPower Studio, Data Management Studio, and the DataFlux Data Management Server) to the Databricks Lakehouse Platform. We are seeking a Senior Onshore DataFlux Solution Architect to lead the architectural strategy, target-state design, and migration blueprint for this multi-phase program. This role is a hands-on, client-facing leadership position responsible for translating legacy DataFlux logic, business rules, and MDM constructs into a modern, scalable Databricks-native architecture leveraging Delta Lake, Unity Catalog, and Delta Live Tables.

Key Responsibilities
• Lead end-to-end solution architecture for the DataFlux to Databricks migration, including current-state assessment, gap analysis, target-state design, and migration roadmap.
• Reverse-engineer and document existing DataFlux jobs, data services, business rules, QKBs (Quality Knowledge Bases), and MDM hub configurations to produce a complete logical inventory.
• Design the target Databricks Lakehouse architecture (medallion: bronze/silver/gold) with Delta Lake, Unity Catalog governance, and Delta Live Tables pipelines that replicate or improve upon DataFlux DQ and MDM functionality.
• Define the strategy for migrating standardization, parsing, matching, clustering, and survivorship logic from DataFlux into Databricks-native patterns (PySpark, SQL, and partner tools such as Reltio, Informatica CDQ, or Zingg where appropriate).
• Architect the MDM target state for party, product, location, and reference data domains; define golden record logic, hierarchy management, and stewardship workflows on the lakehouse.
• Establish data quality frameworks (DQ rules, scorecards, exception handling) using Delta Live Tables expectations, Great Expectations, or Databricks Lakehouse Monitoring as DataFlux replacements.
• Partner with the client's enterprise architecture, data governance, and security teams to align on Unity Catalog design, lineage, RBAC, and PII handling.
• Provide technical leadership and mentorship to a blended onshore/offshore engineering team; conduct design reviews and enforce engineering standards.
• Serve as the senior client-facing technical advisor: present architecture decisions, trade-offs, and migration progress to Director- and VP-level stakeholders.
• Own technical risk identification and mitigation across the migration lifecycle, including cutover strategy, parallel-run validation, and decommissioning of DataFlux infrastructure.

Required Qualifications

DataFlux Expertise (Non-Negotiable)
• 10+ years of enterprise data architecture experience, with a minimum of 5 years of hands-on experience designing and deploying solutions on SAS DataFlux (dfPower Studio and/or Data Management Studio).
• Deep working knowledge of DataFlux Data Management Server, Architect jobs, Profile jobs, data services, and the QKB (Quality Knowledge Base), including authoring custom definitions, regex libraries, phonetics, and locale-specific rules.
• Demonstrated experience with DataFlux match codes, clustering, entity resolution, and survivorship rule design at enterprise scale.
• Proven ability to reverse-engineer complex, undocumented DataFlux job flows and translate them into modern equivalents.

Master Data Management (MDM)
• Strong architectural experience across MDM domains: Customer/Party, Product, Location, Vendor, Employee, and Reference Data.
• Hands-on experience with at least one enterprise MDM platform in addition to DataFlux: Informatica MDM, Reltio, Profisee, IBM InfoSphere MDM, or Stibo STEP.
• Expertise in match/merge logic, golden record creation, hierarchy management, cross-reference (XREF) design, and data stewardship workflows.

Databricks & Modern Data Stack
• Production experience architecting solutions on Databricks, including Delta Lake, Unity Catalog, Delta Live Tables, Workflows, and the medallion architecture pattern.
• Strong PySpark and Spark SQL skills; able to design performant patterns for large-scale matching, deduplication, and DQ workloads.
• Working knowledge of cloud platforms (Azure, AWS, or GCP) and modern ingestion tools (Fivetran, ADF, Airflow, dbt).

Data Domains & Governance
• Broad fluency across data quality, data governance, data modeling (3NF, dimensional, Data Vault), and metadata management.
• Experience implementing data governance tooling (Collibra, Alation, Atlan, or Unity Catalog-native governance).
• Familiarity with regulatory and privacy frameworks (HIPAA, GDPR, CCPA, SOX) and their impact on MDM and DQ design.

Preferred Qualifications
• Prior experience leading at least one DataFlux modernization or sunset program.
• Databricks certifications (Data Engineer Professional, Solutions Architect Professional).
• Experience in healthcare payer, financial services, or insurance verticals.
• Background in consulting or professional services; comfortable with SOW-driven delivery and billable utilization expectations.

Contact: Snehil Mishra
📧 snehil@ampstek.com
📞 Desk: 609-360-2673 Ext. 125
🔗 LinkedIn
🌐 www.ampstek.com