

Avalon Information Technologies L.L.C
Data Architect - Health Care/Life Sciences | GCP · Matillion · dbt - Remote Contract Position
Featured Role | Apply direct with Data Freelance Hub
This is a remote 6-12 month contract for a Data Architect in Health Care/Life Sciences at a pay rate of "$X". It requires 8-14 years of experience; expertise in GCP BigQuery, Matillion, and dbt; and clinical data knowledge.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: May 10, 2026
Duration: Unknown
Location: Unknown
Contract: Unknown
Security: Unknown
Location detailed: United States
Skills detailed: #Automation #BigQuery #dbt (data build tool) #Matillion #Metadata #DevOps #Deployment #Clustering #Data Quality #Datasets #GIT #ADaM (Analysis Data Model) #Compliance #IAM (Identity and Access Management) #Collibra #SQL (Structured Query Language) #GCP (Google Cloud Platform) #CDISC (Clinical Data Interchange Standards Consortium) #VPC (Virtual Private Cloud) #Informatica #FHIR (Fast Healthcare Interoperability Resources) #Scala #ML (Machine Learning) #AI (Artificial Intelligence) #Security #Storage #Documentation #Data Engineering #GitLab #Python #Data Management #GitHub #Alation #Data Architecture #Terraform #Macros #Airflow #Cloud #Logging #Data Governance #MDM (Master Data Management) #Data Catalog #ETL (Extract, Transform, Load) #Automated Testing #Monitoring
Role description
Role Overview
We are seeking an experienced Data Architect with deep, hands-on expertise in GCP BigQuery, Matillion ETL, and dbt (Core/Cloud). You will lead the design and delivery of cloud-native, scalable data platforms for clinical and life-sciences clients, establishing architecture patterns that span ingestion, transformation, governance, and analytics.
This is a senior individual-contributor and technical-lead role with high visibility into client delivery and pre-sales solutioning.
Core Technology Stack
Cloud Platform: GCP · BigQuery · Cloud Storage · Cloud Composer (Airflow)
ETL / Orchestration: Matillion ETL · Cloud Composer · Cloud Build · GitLab CI / GitHub Actions
Transformation & Modeling: dbt Core / Cloud · Staging · Intermediate · Marts · Macros · Tests
Languages: SQL (advanced) · Python · Jinja2
Clinical Standards: CDISC SDTM/ADaM · OMOP · HL7/FHIR · EHR/EMR · EDC/CTMS
DevOps & Governance: Git · Code Review · CI/CD · Data Quality · Audit Logging
Key Responsibilities
GCP BigQuery Architecture
• Design multi-project BigQuery environments aligned to medallion architecture principles
• Optimize query performance using partitioning, clustering, and materialization strategies
• Enforce dataset-level IAM, column-level security, and VPC Service Controls for regulated data
• Integrate BigQuery with Cloud Storage, Pub/Sub, and Vertex AI for end-to-end analytics pipelines
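As an illustration of the partitioning and clustering strategies above, a minimal BigQuery DDL sketch (the project, dataset, table, and column names are hypothetical, not part of this role's actual environment):

```sql
-- Hypothetical clinical events table: partitioned by event date,
-- clustered on the columns most often used in filters and joins.
CREATE TABLE IF NOT EXISTS `rwd_project.clinical_mart.patient_events`
(
  patient_id STRING NOT NULL,
  event_type STRING,
  event_ts   TIMESTAMP,
  site_id    STRING
)
PARTITION BY DATE(event_ts)
CLUSTER BY patient_id, event_type
OPTIONS (
  partition_expiration_days = 3650,
  description = 'Illustrative example; names are placeholders'
);
```

Partitioning on the timestamp column lets the engine prune partitions for date-bounded queries, while clustering keeps rows for the same patient and event type physically co-located, reducing bytes scanned for the most common access patterns.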
Matillion ETL – Orchestration & Ingestion
• Architect Matillion job hierarchies including orchestration jobs, transformation jobs, and reusable profiles
• Implement parameterization patterns for scalable multi-source loading
• Design error-handling frameworks with audit logging, retry logic, and alerting via Cloud Monitoring
• Migrate and refactor legacy ETL pipelines (on-prem or Informatica) into Matillion on GCP
dbt Transformation Layer
• Build and govern dbt project structures: staging → intermediate → mart model layers
• Author reusable macros (Jinja2), generic and singular tests, and schema.yml documentation blocks
• Enforce CI/CD pipelines for dbt via GitLab CI / GitHub Actions
• Connect dbt docs to DataHub, Collibra, or Alation for enterprise data governance
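A minimal sketch of the staging-layer pattern described above, using dbt's Jinja `source()` macro (the source, table, and column names are hypothetical):

```sql
-- models/staging/stg_ehr__encounters.sql
-- Staging models rename and cast only; business logic lives downstream.
with source as (

    select * from {{ source('ehr_raw', 'encounters') }}

),

renamed as (

    select
        cast(encounter_id as string)   as encounter_id,
        cast(patient_id as string)     as patient_id,
        cast(admit_ts as timestamp)    as admitted_at,
        lower(encounter_class)         as encounter_class
    from source

)

select * from renamed
```

A companion `schema.yml` would then declare generic tests (e.g. `unique` and `not_null` on `encounter_id`) and the column descriptions that dbt docs surfaces to the catalog.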
Cloud Composer & Workflow Orchestration
• Design and maintain Cloud Composer (Airflow) DAGs to coordinate Matillion jobs, dbt runs, and downstream consumers
• Implement SLA monitoring, alerting, and dependency management across cross-domain pipelines
DevOps & CI/CD
• Manage Git branching strategies across data engineering repositories
• Configure Cloud Build / GitHub Actions / GitLab CI pipelines for automated testing, linting, and deployment
• Apply infrastructure-as-code (Terraform) for BigQuery datasets, Matillion instances, and IAM resources
Clinical Data & Compliance
• Map source EHR/EMR, EDC, CTMS, and claims data to CDISC SDTM/ADaM and OMOP CDM standards
• Design HL7/FHIR-compatible ingestion pipelines for real-world evidence and interoperability use cases
• Ensure platform compliance with GxP, HIPAA, 21 CFR Part 11, and SOC 2 requirements
Required Qualifications
• 8–14 years in data architecture, data engineering, or analytics engineering roles
• 4+ years hands-on with GCP BigQuery: DDL/DML, query optimization, IAM, Data Transfer Service, BigQuery ML
• 3+ years with Matillion ETL: job design, orchestration patterns, GCP connectors, Python components
• 3+ years with dbt Core and/or Cloud: project structure, macros, testing, documentation, CI/CD integration
• Advanced SQL proficiency: window functions, CTEs, performance tuning, and analytical patterns
• Proficiency in Python for data validation, automation, and custom Matillion/Airflow components
• Strong Git-based workflow discipline: branching, code review, pull requests, semantic versioning
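For illustration, the kind of CTE-plus-window-function pattern the SQL requirement above refers to (table and column names are hypothetical):

```sql
-- Latest result per patient per lab test: a CTE ranks rows with a
-- window function, and the outer query keeps only the newest row.
with ranked as (

    select
        patient_id,
        test_code,
        result_value,
        row_number() over (
            partition by patient_id, test_code
            order by result_ts desc
        ) as rn
    from `rwd_project.clinical_mart.lab_results`

)

select patient_id, test_code, result_value
from ranked
where rn = 1
```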
Preferred Qualifications
• Google Professional Data Engineer or Cloud Architect certification
• Experience in clinical / pharmaceutical / life-sciences data environments (CDISC, OMOP, HL7/FHIR)
• Familiarity with Terraform for GCP infrastructure provisioning
• Exposure to the dbt Semantic Layer, MetricFlow, or dbt Cloud Enterprise features
• Experience with DataHub, Collibra, or Alation for metadata management and data cataloging
• Background in MDM solutions (Reltio, Informatica MDM) is a plus





