

Rivago Infotech Inc
Data Architect (with OpenShift)
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Architect (Google Cloud) with a contract length of "unknown" and a pay rate of "unknown." Located in Dallas, TX or Charlotte, NC (Hybrid), it requires 10-14 years of experience, expertise in GCP, OCP, and PySpark, and relevant certifications.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
May 12, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Dallas, TX
-
🧠 - Skills detailed
#Dataflow #Migration #Storage #Logging #Monitoring #Data Lake #Apache Beam #Strategy #Hadoop #Datasets #Data Management #Data Governance #Batch #Data Processing #Data Strategy #GIT #Metadata #Data Pipeline #Schema Design #BigQuery #Security #Clustering #Airflow #Cloud #GCP (Google Cloud Platform) #Data Lineage #Big Data #Spark (Apache Spark) #Observability #Data Engineering #Programming #AI (Artificial Intelligence) #Deployment #DevOps #Data Architecture #BI (Business Intelligence) #Containers #Pig #Data Quality #ETL (Extract, Transform, Load) #Scala #Computer Science #HDFS (Hadoop Distributed File System) #IAM (Identity and Access Management) #Trend Analysis #Sqoop (Apache Sqoop) #PySpark #Data Catalog #Compliance #Data Ingestion #Data Warehouse #SQL (Structured Query Language) #Python #Automation
Role description
Role: Google Cloud Data Architect – IAM Data Modernization
Location: Dallas, TX / Charlotte, NC (Hybrid – 4 days in office)
OpenShift Container Platform (OCP) experience is highly preferred.
Project/Program
Identity & Access Management (IAM) Data Modernization: migration of an on‑premises SQL data warehouse to a target‑state Data Lake on Google Cloud (GCP). The platform enables metrics and reporting, advanced analytics, and GenAI use cases (natural language querying, accelerated summarization, cross‑domain trend analysis), and leverages PySpark‑based processing, cloud‑native DevOps CI/CD pipelines, and containerized deployments on OpenShift (OCP) to deliver scalable, secure, and high‑performance data solutions.
About Program/Project
The IAM Data Modernization project involves migrating an on-premises SQL data warehouse to a target-state Data Lake in a GCP cloud environment. Key highlights include:
• Integration Scope: 30+ source system data ingestions and multiple downstream integrations
• Capabilities: Metrics, reporting, and Gen AI use cases with natural language querying, advanced pattern/trend analysis, faster summarizations, and cross-domain metric monitoring
• Benefits:
  • Scalability and access to advanced cloud functionality
  • Highly available and performant semantic layer with historical data support
  • Unified data strategy for executive reporting, analytics, and Gen AI across cyber domains
This modernization establishes a single source of truth for enterprise-wide data-driven decision-making.
Required Skills
DevOps / CI‑CD
• Experience implementing CI/CD pipelines for data and analytics workloads
• Familiarity with Git‑based source control, build automation, and deployment strategies
Containers & Platform
• Experience with OpenShift Container Platform (OCP) for deploying data workloads and services
• Understanding of containerized architecture, scaling, and environment management
• Proven ability to build CI/CD pipelines for data and infrastructure workloads
• Experience managing secrets securely using GCP Secret Manager (see the sketch after this list)
• Ownership of observability, SLOs, dashboards, alerts, and runbooks
• Proficiency in logging, monitoring, and alerting for data pipelines and platform reliability
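As a concrete reference for the Secret Manager bullet above, a minimal sketch of reading a credential at runtime; the project and secret IDs are placeholders, not project specifics:

```python
from google.cloud import secretmanager

def get_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch a secret payload from GCP Secret Manager (placeholder IDs)."""
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")

# Example: a database password for an ingestion job (hypothetical names).
db_password = get_secret("my-gcp-project", "warehouse-db-password")
```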
Big Data & Processing
• Hands‑on experience with PySpark for ETL/ELT, data transformation, and performance optimization (see the sketch after this list)
• Solid understanding of distributed data processing concepts
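To make the PySpark expectation concrete, a minimal ETL sketch; the GCS paths and column names are illustrative assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("iam-etl-sketch").getOrCreate()

# Read raw extracts from a hypothetical landing path.
raw = spark.read.parquet("gs://example-bucket/raw/access_events/")

clean = (
    raw.dropDuplicates(["event_id"])                     # idempotent re-runs
       .withColumn("event_date", F.to_date("event_ts"))  # derive partition key
       .filter(F.col("user_id").isNotNull())             # basic quality gate
)

# Write partitioned Parquet for downstream analytics.
(clean.write.mode("overwrite")
      .partitionBy("event_date")
      .parquet("gs://example-bucket/curated/access_events/"))
```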
Data & Cloud Architecture
• Strong experience designing data platforms on Google Cloud Platform (GCP)
• Experience with Data Lakes, data warehousing, and large‑scale migration programs
Data Lake Architecture & Storage
• Proven experience designing and implementing data lake architectures (e.g., Bronze/Silver/Gold or layered models)
• Strong knowledge of Cloud Storage (GCS) design, including bucket layout, naming conventions, lifecycle policies, and access controls
• Experience with Hadoop/HDFS architecture, distributed file systems, and data locality principles
• Hands-on experience with columnar data formats (Parquet, Avro, ORC) and compression techniques
• Expertise in partitioning strategies, backfills, and large-scale data organization (see the sketch after this list)
• Ability to design data models optimized for analytics and BI consumption
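One way to encode a layered (Bronze/Silver/Gold) GCS layout and a date-based partitioning convention; the bucket names and the Hive-style `dt=` scheme are assumptions for illustration:

```python
from datetime import date

# Hypothetical layer-to-bucket mapping for a layered lake on GCS.
LAYERS = {
    "bronze": "gs://example-lake-bronze",  # raw, immutable landings
    "silver": "gs://example-lake-silver",  # cleaned, conformed records
    "gold": "gs://example-lake-gold",      # curated, BI-ready models
}

def partition_path(layer: str, domain: str, table: str, run_date: date) -> str:
    """Build a Hive-style partition path, e.g. .../dt=2026-05-12/."""
    return f"{LAYERS[layer]}/{domain}/{table}/dt={run_date.isoformat()}/"

print(partition_path("silver", "iam", "access_events", date(2026, 5, 12)))
# gs://example-lake-silver/iam/access_events/dt=2026-05-12/
```

A consistent path convention like this is what keeps lifecycle policies, access controls, and backfills tractable at scale.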
Data Ingestion & Orchestration
• Experience building batch and streaming ingestion pipelines using GCP-native services
• Knowledge of Pub/Sub-based streaming architectures, event schema design, and versioning
• Strong understanding of incremental ingestion and CDC patterns, including idempotency and deduplication
• Hands-on experience with workflow orchestration tools such as Cloud Composer / Airflow (see the sketch after this list)
• Ability to design robust error handling, replay, and backfill mechanisms
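A skeletal Cloud Composer / Airflow DAG showing how the retry, replay, and backfill concerns above are typically expressed; the DAG ID, source names, and ingestion callable are hypothetical:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest_source(source: str, ds: str) -> None:
    """Placeholder incremental ingestion for one source, keyed by logical date."""
    print(f"Ingesting {source} for {ds}")

# catchup=True lets Airflow backfill missed intervals; retries absorb
# transient failures. Uses the Airflow 2.4+ `schedule` argument.
with DAG(
    dag_id="iam_ingestion_sketch",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",
    catchup=True,
    default_args={"retries": 3, "retry_delay": timedelta(minutes=10)},
) as dag:
    for source in ("ad_exports", "okta_logs"):  # stand-ins for the 30+ sources
        PythonOperator(
            task_id=f"ingest_{source}",
            python_callable=ingest_source,
            op_kwargs={"source": source, "ds": "{{ ds }}"},
        )
```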
Data Processing & Transformation
• Experience developing scalable batch and streaming pipelines using Dataflow (Apache Beam) and/or Spark (Dataproc)
• Strong proficiency in BigQuery SQL, including query optimization, partitioning, clustering, and cost control (see the sketch after this list)
• Hands-on experience with Hadoop MapReduce and ecosystem tools (Hive, Pig, Sqoop)
• Advanced Python programming skills for data engineering, including testing and maintainable code design
• Experience managing schema evolution while minimizing downstream impact
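A short sketch of the BigQuery cost-control ideas above using the Python client: the partition filter keeps scanned bytes down, and `maximum_bytes_billed` fails the job before an expensive full scan. The project, dataset, and table names are made up:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Filtering on the (assumed) partition column prunes partitions,
# which is what actually controls scanned bytes and cost.
sql = """
    SELECT user_id, COUNT(*) AS events
    FROM `example-project.iam_lake.access_events`
    WHERE event_date BETWEEN '2026-05-01' AND '2026-05-07'
    GROUP BY user_id
"""

# Guardrail: abort rather than bill more than ~1 GB of scanned data.
job_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**9)
for row in client.query(sql, job_config=job_config).result():
    print(row.user_id, row.events)
```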
Analytics & Data Serving
• Expertise in BigQuery performance optimization and data serving patterns
• Experience building semantic layers and governed metrics for consistent analytics (see the sketch after this list)
• Familiarity with BI integration, access controls, and dashboard standards
• Understanding of data exposure patterns via views, APIs, or curated datasets
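As a sketch of the serving pattern above, a governed metric exposed through a curated view so every dashboard reads one definition; all identifiers are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# The view encodes the metric logic once; consumers never touch raw tables.
ddl = """
    CREATE OR REPLACE VIEW `example-project.iam_serving.daily_active_users` AS
    SELECT event_date, COUNT(DISTINCT user_id) AS dau
    FROM `example-project.iam_lake.access_events`
    GROUP BY event_date
"""
client.query(ddl).result()  # DDL statements run as ordinary query jobs
```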
Data Governance, Quality & Metadata
• Experience implementing data catalogs, metadata management, and ownership models
• Understanding of data lineage for auditability and troubleshooting
• Strong focus on data quality frameworks, including validation, freshness checks, and alerting (see the sketch after this list)
• Experience defining and enforcing data contracts, schemas, and SLAs
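A minimal example of the validation and freshness checks mentioned above, written in PySpark against the hypothetical curated table from the earlier sketches; the one-day freshness SLA is an assumption:

```python
from datetime import date, timedelta

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-sketch").getOrCreate()
df = spark.read.parquet("gs://example-lake-silver/iam/access_events/")

# Validation: required keys must be present and unique.
null_keys = df.filter(F.col("event_id").isNull()).count()
dupes = df.count() - df.dropDuplicates(["event_id"]).count()

# Freshness: newest partition should be no older than yesterday (assumed SLA).
max_dt = df.agg(F.max("event_date")).first()[0]
stale = max_dt is None or max_dt < date.today() - timedelta(days=1)

if null_keys or dupes or stale:
    # In practice this would raise an alert through the monitoring stack.
    raise ValueError(f"DQ failure: null_keys={null_keys}, dupes={dupes}, stale={stale}")
```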
Good to have
Security, Privacy & Compliance
• Hands-on experience implementing fine-grained access controls for BigQuery and GCS
• Experience with sprint planning and providing technical guidance to the team
• Strong stakeholder communication and solution‑architecture skills
Qualifications
• Experience: 10–14+ years in DevOps and Data Architecture, with 5+ years designing on PySpark/GCP/OCP at scale; prior on‑prem → cloud migration experience is a must.
• Education: Bachelor’s/Master’s in Computer Science, Information Systems, or equivalent experience.
• Certifications: Google Cloud Professional Cloud Architect, DevOps, or OCP certification (required at hire, or obtained within 3 months). A plus: Professional Data Engineer, Security Engineer.