

VySystems
Data Architect
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Architect focused on IAM Data Modernization; the contract length is unknown and the pay rate is listed as "$X per hour". Key skills include GCP expertise, data lake architecture, and data ingestion. The role requires 10-14 years of experience and Google Cloud Professional Cloud Architect certification.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
March 6, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Dallas, TX
-
🧠 - Skills detailed
#"ETL (Extract #Transform #Load)" #Data Management #AI (Artificial Intelligence) #Data Catalog #Data Architecture #Hadoop #Programming #Observability #Datasets #Data Pipeline #Compliance #SQL (Structured Query Language) #Sqoop (Apache Sqoop) #Cloud #Security #Airflow #Dataflow #Monitoring #Data Governance #Data Warehouse #Pig #Trend Analysis #Apache Beam #DevOps #Schema Design #Logging #Batch #Scala #Data Engineering #Data Lake #Migration #Metadata #BI (Business Intelligence) #Data Lineage #HDFS (Hadoop Distributed File System) #Strategy #Clustering #BigQuery #Computer Science #Data Quality #Spark (Apache Spark) #Data Processing #Storage #Data Ingestion #GCP (Google Cloud Platform) #Data Strategy #IAM (Identity and Access Management) #Python
Role description
Identity & Access Management (IAM) Data Modernization– migration of an on‑premises SQL data warehouse to a target‑stateData Lake on Google Cloud (GCP), enabling metrics & reporting, advanced analytics, andGenAIuse cases (natural language querying, accelerated summarization, cross‑domain trend analysis).
About Program/Project
The IAM Data Modernization project involves migrating an on-premises SQL data warehouse to a target-state Data Lake in a GCP cloud environment. Key highlights include:
• Integration Scope: 30+ source system data ingestions and multiple downstream integrations
• Capabilities: Metrics, reporting, and Gen AI use cases with natural language querying, advanced pattern/trend analysis, faster summarizations, and cross-domain metric monitoring
• Benefits:
  • Scalability and access to advanced cloud functionality
  • Highly available and performant semantic layer with historical data support
  • Unified data strategy for executive reporting, analytics, and Gen AI across cyber domains
This modernization establishes a single source of truth for enterprise-wide data-driven decision-making.
Required Skills
Data Lake Architecture & Storage
• Proven experience designing and implementing data lake architectures (e.g., Bronze/Silver/Gold or layered models)
• Strong knowledge of Cloud Storage (GCS) design, including bucket layout, naming conventions, lifecycle policies, and access controls
• Experience with Hadoop/HDFS architecture, distributed file systems, and data locality principles
• Hands-on experience with columnar data formats (Parquet, Avro, ORC) and compression techniques
• Expertise in partitioning strategies, backfills, and large-scale data organization
• Ability to design data models optimized for analytics and BI consumption (see the sketch after this list)
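To ground the layered-model and partitioning bullets, here is a minimal Python sketch of Hive-style partitioned Parquet writes in a Bronze zone. The path, columns, and layout are hypothetical; a production pipeline would write to gs:// URIs (e.g., via gcsfs) and follow the bucket naming and lifecycle conventions the role calls for.

```python
# Minimal sketch: partitioned Parquet in a layered (Bronze/Silver/Gold)
# lake layout. All names here are illustrative placeholders.
import pyarrow as pa
import pyarrow.parquet as pq

# Records as they might land in the Bronze zone after raw ingestion.
records = pa.table({
    "event_date": ["2026-03-01", "2026-03-01", "2026-03-02"],
    "source_system": ["okta", "ad", "okta"],
    "user_id": ["u1", "u2", "u3"],
})

# Hive-style partitioning keeps backfills cheap: re-running a day simply
# rewrites that day's directory, and query engines prune on the keys.
pq.write_to_dataset(
    records,
    root_path="lake/bronze/iam_events",  # in practice: a gs:// URI
    partition_cols=["event_date", "source_system"],
)
```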
Qualifications
• Experience: [10–14]+ years in data engineering/architecture, 5+ years designing on GCP at scale; prior on‑prem → cloud migration a must.
• Education: Bachelor’s/Master’s in Computer Science, Information Systems, or equivalent experience.
• Certifications: Google Cloud Professional Cloud Architect (required or within 3 months). Plus: Professional Data Engineer, Security Engineer.
Data Ingestion & Orchestration
• Experience building batch and streaming ingestion pipelines using GCP-native services
• Knowledge of Pub/Sub-based streaming architectures, event schema design, and versioning
• Strong understanding of incremental ingestion and CDC patterns, including idempotency and deduplication
• Hands-on experience with workflow orchestration tools (Cloud Composer / Airflow)
• Ability to design robust error handling, replay, and backfill mechanisms (see the sketch after this list)
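As a sketch of the orchestration and backfill bullets: a minimal Cloud Composer / Airflow 2.x DAG whose task keys its work to the logical date, which is what makes replays and catchup backfills idempotent. The DAG id and load logic are hypothetical.

```python
# Minimal Airflow 2.x sketch of an idempotent daily ingestion DAG.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def load_partition(ds: str, **_):
    # Keying the load to the logical date ("ds") makes reruns idempotent:
    # replaying an interval overwrites only that interval's partition.
    print(f"Loading source extract into partition {ds}")

with DAG(
    dag_id="iam_daily_ingest",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",
    catchup=True,  # lets Airflow backfill missed intervals one by one
) as dag:
    PythonOperator(task_id="load_partition", python_callable=load_partition)
```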
Data Processing & Transformation
• Experience developing scalable batch and streaming pipelines using Dataflow (Apache Beam) and/or Spark (Dataproc)
• Strong proficiency in BigQuery SQL, including query optimization, partitioning, clustering, and cost control
• Hands-on experience with Hadoop MapReduce and ecosystem tools (Hive, Pig, Sqoop)
• Advanced Python programming skills for data engineering, including testing and maintainable code design
• Experience managing schema evolution while minimizing downstream impact (see the sketch after this list)
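A minimal Apache Beam sketch of the batch-transform pattern named above, runnable locally on the DirectRunner and portable to Dataflow via pipeline options; the input path and parsing logic are hypothetical.

```python
# Minimal Beam sketch: Bronze -> Silver batch transform.
import json

import apache_beam as beam

def to_metric_row(line: str) -> dict:
    event = json.loads(line)
    return {"user_id": event["user_id"], "event_date": event["date"]}

# DirectRunner by default; pass DataflowRunner options to run on GCP.
with beam.Pipeline() as pipeline:
    (
        pipeline
        | "ReadBronze" >> beam.io.ReadFromText("lake/bronze/events.jsonl")
        | "Parse" >> beam.Map(to_metric_row)
        | "WriteSilver" >> beam.io.WriteToText("lake/silver/metrics")
    )
```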
Analytics & Data Serving
• Expertise in BigQuery performance optimization and data serving patterns
• Experience building semantic layers and governed metrics for consistent analytics
• Familiarity with BI integration, access controls, and dashboard standards
• Understanding of data exposure patterns via views, APIs, or curated datasets (see the sketch after this list)
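One concrete form of the serving patterns above is exposing curated data through a BigQuery view rather than raw tables. A minimal sketch, with hypothetical project, dataset, and table names:

```python
# Minimal sketch: serve a governed metric through a BigQuery view.
from google.cloud import bigquery

client = bigquery.Client(project="iam-analytics-prj")  # hypothetical project

view = bigquery.Table("iam-analytics-prj.serving.active_users_v")
view.view_query = """
    SELECT user_id, COUNT(*) AS logins_30d
    FROM `iam-analytics-prj.silver.login_events`
    WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY user_id
"""
client.create_table(view)  # BI tools query the view, never the raw table
```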
Data Governance, Quality & Metadata
• Experience implementing data catalogs, metadata management, and ownership models
• Understanding of data lineage for auditability and troubleshooting
• Strong focus on data quality frameworks, including validation, freshness checks, and alerting
• Experience defining and enforcing data contracts, schemas, and SLAs
• Familiarity with audit logging and compliance readiness (see the sketch after this list)
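As an example of the validation and freshness checks listed above, a minimal sketch, assuming a hypothetical table with a load_ts column and a 6-hour freshness SLA:

```python
# Minimal sketch: a freshness check, one unit of a data quality framework.
from datetime import datetime, timedelta, timezone

from google.cloud import bigquery

client = bigquery.Client()
row = next(iter(client.query(
    "SELECT MAX(load_ts) AS latest FROM `prj.silver.login_events`"
).result()))

stale = row.latest is None or (
    datetime.now(timezone.utc) - row.latest > timedelta(hours=6)
)
if stale:
    # A real framework would route this to alerting, not just fail.
    raise RuntimeError("Freshness SLA breached: silver.login_events")
```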
Cloud Platform Management
• Strong hands-on experience with Google Cloud Platform (GCP), including project setup, environment separation, billing, quotas, and cost controls
• Expertise in IAM and security best practices, including least-privilege access, service accounts, and role-based access
• Solid understanding of VPC networking, private access patterns, and secure service connectivity
• Experience with encryption and key management (KMS, CMEK) and security auditing (see the sketch after this list)
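To illustrate the least-privilege bullet above: a minimal sketch granting one service account read-only access to a single bucket, with hypothetical bucket and account names:

```python
# Minimal sketch: least-privilege, bucket-scoped, read-only IAM grant.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("iam-lake-silver")  # hypothetical bucket

policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectViewer",  # read-only, nothing broader
    "members": {"serviceAccount:bi-reader@prj.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)
```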
DevOps, Platform & Reliability
• Proven ability to build CI/CD pipelines for data and infrastructure workloads
• Experience managing secrets securely using GCP Secret Manager
• Ownership of observability, SLOs, dashboards, alerts, and runbooks
• Proficiency in logging, monitoring, and alerting for data pipelines and platform reliability (see the sketch after this list)
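For the secrets bullet above, a minimal sketch of pulling a credential from GCP Secret Manager at runtime; the project and secret names are hypothetical:

```python
# Minimal sketch: read a secret at runtime instead of hard-coding it.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
name = "projects/iam-data-prj/secrets/warehouse-password/versions/latest"
response = client.access_secret_version(request={"name": name})
password = response.payload.data.decode("UTF-8")
```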
Good to have
Security, Privacy & Compliance
• Hands-on experience implementing fine-grained access controls for BigQuery and GCS
• Experience with VPC Service Controls and data exfiltration prevention
• Knowledge of PII handling, data masking, tokenization, and audit requirements (see the sketch after this list)
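As one sketch of the masking/tokenization item above: deterministic keyed tokenization of a PII value before it lands in shared zones. The key handling here is deliberately simplified (in practice the key would come from Secret Manager), and BigQuery policy tags or dynamic masking would complement this at query time.

```python
# Minimal sketch: keyed tokenization of a PII value.
import hashlib
import hmac

TOKEN_KEY = b"placeholder-key"  # in practice: fetched from Secret Manager

def tokenize(value: str) -> str:
    # Deterministic keyed hash: joins on the token still work,
    # but the raw identifier never reaches shared zones.
    return hmac.new(TOKEN_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

print(tokenize("jane.doe@example.com"))
```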
Identity & Access Management (IAM) Data Modernization– migration of an on‑premises SQL data warehouse to a target‑stateData Lake on Google Cloud (GCP), enabling metrics & reporting, advanced analytics, andGenAIuse cases (natural language querying, accelerated summarization, cross‑domain trend analysis).
About Program/Project
The IAM Data Modernization project involves migrating an on-premises SQL data warehouse to a target state Data Lake in GCP cloud environment. Key highlights include:
• Integration Scope:30+ source system data ingestions and multiple downstream integrations
• Capabilities:Metrics, reporting, and Gen AI use cases with natural language querying, advanced pattern/trend analysis, faster summarizations, and cross-domain metric monitoring
• Benefits:
• Scalability and access to advanced cloud functionality
• Highly available and performant semantic layer with historical data support
• Unified data strategy for executive reporting, analytics, and Gen AI across cyber domains
This modernization establishes a single source of truth for enterprise-wide data-driven decision-making.
Required Skills
Data Lake Architecture & Storage
• Proven experience designing and implementingdata lake architectures(e.g., Bronze/Silver/Gold or layered models).
• Strong knowledge ofCloud Storage (GCS)design, including bucket layout, naming conventions, lifecycle policies, and access controls
·ExperiencewithHadoop/HDFS architecture, distributed file systems, and data locality principles
• Hands-on experience withcolumnar data formats(Parquet, Avro, ORC) and compression techniques
• Expertise inpartitioning strategies, backfills, and large-scale data organization
• Ability to designdata modelsoptimized for analytics and BI consumption
Qualifications
• Experience:[10–14]+ years in data engineering/architecture,5+years designing onGCPat scale; prior on‑prem → cloud migration a must.
• Education:Bachelor’s/Master’s in Computer Science, Information Systems, or equivalent experience.
• Certifications:Google Cloud Professional Cloud Architect(required or within 3 months).Plus:Professional Data Engineer, Security Engineer.
Data Ingestion & Orchestration
·Experience buildingbatch and streaming ingestion pipelinesusing GCP-native services
·Knowledge ofPub/Sub-based streaming architectures, event schema design, and versioning
·Strong understanding ofincremental ingestion and CDC patterns, including idempotency and deduplication
·Hands-on experience withworkflow orchestrationtools (Cloud Composer / Airflow)
·Ability to design robusterror handling, replay, and backfill mechanisms
Data Processing & Transformation
·Experience developing scalablebatch and streaming pipelinesusing Dataflow (Apache Beam) and/or Spark (Dataproc)
·Strong proficiency inBigQuery SQL, including query optimization, partitioning, clustering, and cost control.
·Hands-on experience with HadoopMapReduceand ecosystem tools (Hive, Pig, Sqoop)
·AdvancedPython programming skillsfor data engineering, including testing and maintainable code design
·Experience managingschema evolutionwhile minimizing downstream impact
Analytics & Data Serving
·Expertise inBigQuery performance optimizationand data serving patterns
·Experience buildingsemantic layers and governed metricsfor consistent analytics
·Familiarity withBI integration, access controls, and dashboard standards
·Understanding of data exposure patterns viaviews, APIs, or curated datasets
Data Governance, Quality & Metadata
·Experience implementingdata catalogs, metadata management, and ownership models
·Understanding ofdata lineagefor auditability and troubleshooting
·Strong focus ondata quality frameworks, including validation, freshness checks, and alerting
·Experience defining and enforcingdata contracts, schemas, and SLAs
·Familiarity withaudit logging and compliance readiness
Cloud Platform Management
·Strong hands-on experience withGoogle Cloud Platform (GCP), including project setup, environment separation, billing, quotas, and cost controls
·Expertise inIAM and security best practices, including least-privilege access, service accounts, and role-based access
·Solid understanding ofVPC networking, private access patterns, and secure service connectivity
·Experience withencryption and key management(KMS, CMEK) and security auditing
DevOps, Platform & Reliability
·Proven ability to buildCI/CD pipelinesfor data and infrastructure workloads
·Experience managingsecretssecurely using GCP Secret Manager
·Ownership ofobservability, SLOs, dashboards, alerts, and runbooks
·Proficiency inlogging, monitoring, and alertingfor data pipelines and platform reliability
Good to have
Security, Privacy & Compliance
·Hands-on experience implementingfine-grained access controlsfor BigQuery and GCS
·Experience withVPC Service Controlsand data exfiltration prevention
·Knowledge ofPII handling, data masking, tokenization, and audit requirements