Avalon Information Technologies L.L.C

Data Architect - Health Care/Life Sciences | GCP · Matillion · dbt - Remote Contract Position

⭐ - Featured Role | Apply direct with Data Freelance Hub
This is a remote, 6-12 month contract position for a Data Architect in Health Care/Life Sciences at a pay rate of "$X". It requires 8-14 years of experience; expertise in GCP BigQuery, Matillion, and dbt; and clinical data knowledge.
🌎 - Country
United States
πŸ’± - Currency
$ USD
-
πŸ’° - Day rate
Unknown
-
πŸ—“οΈ - Date
May 10, 2026
πŸ•’ - Duration
Unknown
-
🏝️ - Location
Unknown
-
πŸ“„ - Contract
Unknown
-
πŸ”’ - Security
Unknown
-
πŸ“ - Location detailed
United States
-
🧠 - Skills detailed
#Automation #BigQuery #dbt (data build tool) #Matillion #Metadata #DevOps #Deployment #Clustering #Data Quality #Datasets #GIT #ADaM (Analysis Data Model) #Compliance #IAM (Identity and Access Management) #Collibra #SQL (Structured Query Language) #GCP (Google Cloud Platform) #CDISC (Clinical Data Interchange Standards Consortium) #VPC (Virtual Private Cloud) #Informatica #FHIR (Fast Healthcare Interoperability Resources) #Scala #ML (Machine Learning) #AI (Artificial Intelligence) #Security #Storage #Documentation #Data Engineering #GitLab #Python #Data Management #GitHub #Alation #Data Architecture #Terraform #Macros #Airflow #Cloud #Logging #Data Governance #MDM (Master Data Management) #Data Catalog #"ETL (Extract #Transform #Load)" #Automated Testing #Monitoring
Role description
Role Overview
We are seeking an experienced Data Architect with deep, hands-on expertise in GCP BigQuery, Matillion ETL, and dbt (Core/Cloud). You will lead the design and delivery of cloud-native, scalable data platforms for clinical and life-sciences clients, establishing architecture patterns that span ingestion, transformation, governance, and analytics. This is a senior individual-contributor and technical-lead role with high visibility into client delivery and pre-sales solutioning.

Core Technology Stack
• Cloud Platform: GCP · BigQuery · Cloud Storage · Cloud Composer (Airflow) · Cloud Build
• ETL / Orchestration: Matillion ETL · Cloud Composer · GitLab CI / GitHub Actions
• Transformation & Modeling: dbt Core / Cloud · Staging · Intermediate · Marts · Macros · Tests
• Languages: SQL (Advanced) · Python · Jinja2
• Clinical Standards: CDISC SDTM/ADaM · OMOP · HL7/FHIR · EHR/EMR · EDC/CTMS
• DevOps & Governance: Git · Code Review · CI/CD · Data Quality · Audit Logging

Key Responsibilities

GCP BigQuery Architecture
• Design multi-project BigQuery environments aligned to medallion architecture principles
• Optimize query performance using partitioning, clustering, and materialization strategies
• Enforce dataset-level IAM, column-level security, and VPC Service Controls for regulated data
• Integrate BigQuery with Cloud Storage, Pub/Sub, and Vertex AI for end-to-end analytics pipelines

Matillion ETL – Orchestration & Ingestion
• Architect Matillion job hierarchies including orchestration jobs, transformation jobs, and reusable profiles
• Implement parameterization patterns for scalable multi-source loading
• Design error-handling frameworks with audit logging, retry logic, and alerting via Cloud Monitoring
• Migrate and refactor legacy ETL pipelines (on-prem or Informatica) into Matillion on GCP

dbt Transformation Layer
• Build and govern dbt project structures: staging → intermediate → mart model layers
• Author reusable macros (Jinja2), generic and singular tests, and schema.yml documentation blocks
• Enforce CI/CD pipelines for dbt via GitLab CI / GitHub Actions
• Connect dbt docs to DataHub, Collibra, or Alation for enterprise data governance

Cloud Composer & Workflow Orchestration
• Design and maintain Cloud Composer (Airflow) DAGs to coordinate Matillion jobs, dbt runs, and downstream consumers
• Implement SLA monitoring, alerting, and dependency management across cross-domain pipelines

DevOps & CI/CD
• Manage Git branching strategies across data engineering repositories
• Configure Cloud Build / GitHub Actions / GitLab CI pipelines for automated testing, linting, and deployment
• Apply infrastructure-as-code (Terraform) for BigQuery datasets, Matillion instances, and IAM resources

Clinical Data & Compliance
• Map source EHR/EMR, EDC, CTMS, and claims data to CDISC SDTM/ADaM and OMOP CDM standards
• Design HL7/FHIR-compatible ingestion pipelines for real-world evidence and interoperability use cases
• Ensure platform compliance with GxP, HIPAA, 21 CFR Part 11, and SOC 2 requirements

Required Qualifications
• 8–14 years in data architecture, data engineering, or analytics engineering roles
• 4+ years hands-on with GCP BigQuery: DDL/DML, query optimization, IAM, Data Transfer Service, BigQuery ML
• 3+ years with Matillion ETL: job design, orchestration patterns, GCP connectors, Python components
• 3+ years with dbt Core and/or Cloud: project structure, macros, testing, documentation, CI/CD integration
• Advanced SQL proficiency: window functions, CTEs, performance tuning, and analytical patterns
• Proficient in Python for data validation, automation, and custom Matillion/Airflow components
• Strong Git-based workflow discipline: branching, code review, pull requests, semantic versioning

Preferred Qualifications
• Google Professional Data Engineer or Cloud Architect certification
• Experience in clinical / pharmaceutical / life-sciences data environments (CDISC, OMOP, HL7/FHIR)
• Familiarity with Terraform for GCP infrastructure provisioning
• Exposure to dbt Semantic Layer, MetricFlow, or dbt Cloud Enterprise features
• Experience with DataHub, Collibra, or Alation for metadata management and data cataloging
• Background in MDM solutions (Reltio, Informatica MDM) is a plus
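To make the BigQuery partitioning and clustering responsibility concrete, here is a minimal Python sketch that emits the kind of DDL involved. The table and column names (`clinical.lab_results`, `result_date`, `patient_id`) are hypothetical examples, not from this posting.

```python
def partitioned_table_ddl(table, columns, partition_col, cluster_cols):
    """Build a BigQuery CREATE TABLE statement with date partitioning
    and clustering. `columns` is a list of (name, type) pairs."""
    col_sql = ",\n  ".join(f"{name} {typ}" for name, typ in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n  {col_sql}\n)\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )

# Hypothetical clinical table: partition by result date, cluster by patient.
ddl = partitioned_table_ddl(
    "clinical.lab_results",
    [("patient_id", "STRING"), ("result_date", "TIMESTAMP"), ("value", "FLOAT64")],
    partition_col="result_date",
    cluster_cols=["patient_id"],
)
print(ddl)
```

Partitioning prunes scanned bytes on date-filtered queries, while clustering sorts data within each partition so patient-level lookups touch fewer blocks.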
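The "error-handling frameworks with audit logging, retry logic, and alerting" responsibility follows a common pattern that can be sketched in plain Python; the task name `load_ehr` and the `flaky_load` helper are illustrative only, not part of any real Matillion API.

```python
import logging
import time

def run_with_retry(task, name, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run a pipeline task with exponential-backoff retries, writing an
    audit log entry for every attempt (success or failure)."""
    log = logging.getLogger("pipeline.audit")
    for attempt in range(1, attempts + 1):
        try:
            result = task()
            log.info("task=%s attempt=%d status=success", name, attempt)
            return result
        except Exception as exc:
            log.warning("task=%s attempt=%d status=error detail=%s", name, attempt, exc)
            if attempt == attempts:
                raise  # exhausted retries: surface the error for alerting
            sleep(base_delay * 2 ** (attempt - 1))

# Demo: a task that fails once, then succeeds on the retry.
calls = {"count": 0}
def flaky_load():
    calls["count"] += 1
    if calls["count"] < 2:
        raise RuntimeError("transient source error")
    return "loaded"

result = run_with_retry(flaky_load, "load_ehr", sleep=lambda _: None)
print(result, calls["count"])  # → loaded 2
```

In a real deployment the logger would feed Cloud Logging, where log-based alerts in Cloud Monitoring pick up the failure entries.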
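The "window functions, CTEs" qualification can be demonstrated with a self-contained snippet; SQLite (bundled with Python, window-function support since SQLite 3.25) stands in for BigQuery here, and the standard-SQL pattern carries over. The table and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lab_results (patient_id TEXT, result_date TEXT, value REAL);
INSERT INTO lab_results VALUES
  ('P1', '2026-01-01', 4.2),
  ('P1', '2026-02-01', 4.8),
  ('P2', '2026-01-15', 5.1);
""")

# CTE + window function: latest result per patient, a classic analytical pattern.
rows = conn.execute("""
WITH ranked AS (
  SELECT patient_id, result_date, value,
         ROW_NUMBER() OVER (
           PARTITION BY patient_id ORDER BY result_date DESC
         ) AS rn
  FROM lab_results
)
SELECT patient_id, value FROM ranked WHERE rn = 1 ORDER BY patient_id
""").fetchall()
print(rows)  # → [('P1', 4.8), ('P2', 5.1)]
```

The same `ROW_NUMBER() OVER (PARTITION BY … ORDER BY …)` deduplication is a staple of dbt staging models over change-data feeds.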
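Running a real Cloud Composer DAG requires an Airflow environment, but the dependency management the role describes — Matillion loads feeding dbt runs feeding downstream consumers — can be sketched with the standard library's `graphlib`. The task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline dependencies: each key runs only after the
# tasks in its value set have completed.
deps = {
    "matillion_load_ehr": set(),
    "matillion_load_claims": set(),
    "dbt_staging": {"matillion_load_ehr", "matillion_load_claims"},
    "dbt_marts": {"dbt_staging"},
    "refresh_dashboards": {"dbt_marts"},
}

# A valid execution order respecting every dependency edge.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

An Airflow DAG encodes the same graph with `>>` operators between task objects; the scheduler then enforces exactly this topological ordering, plus SLA and retry policies per task.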
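The clinical mapping responsibility (EHR source fields to CDISC SDTM) often starts as a simple field-rename table. A minimal sketch follows; the SDTM DM variables (USUBJID, BRTHDTC, SEX, RACE) are real, but the EHR field names, the study ID `HYPO-001`, and the mapping itself are hypothetical.

```python
# Hypothetical mapping from EHR demographics fields to SDTM DM variables.
EHR_TO_SDTM_DM = {
    "mrn": "USUBJID",
    "birth_date": "BRTHDTC",
    "sex": "SEX",
    "race": "RACE",
}

def map_to_sdtm_dm(ehr_record, study_id="HYPO-001"):
    """Rename EHR demographics fields to SDTM DM variables and stamp the
    required STUDYID/DOMAIN identifiers. Unmapped fields are dropped."""
    row = {"STUDYID": study_id, "DOMAIN": "DM"}
    for src, target in EHR_TO_SDTM_DM.items():
        if src in ehr_record:
            row[target] = ehr_record[src]
    return row

dm = map_to_sdtm_dm({"mrn": "12345", "sex": "F", "height_cm": 170})
print(dm)
```

A production pipeline would add controlled-terminology checks and derivation rules, but the rename-and-stamp core is the same whether it lives in dbt models or Matillion transformation jobs.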