Damcosoft

GCP Data Architect (Concord, CA (Onsite))

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a GCP Data Architect in Concord, CA, on a long-term onsite contract with an unspecified pay rate. Key skills include 5+ years of data engineering experience, GCP proficiency, and expertise in Teradata or Hadoop. Certifications preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 21, 2026
-
🕒 - Duration
Unknown
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Concord, CA
-
🧠 - Skills detailed
#Dataflow #Data Warehouse #Programming #Data Integrity #Scala #BigQuery #Hadoop #Data Management #Airflow #Cloud #Data Modeling #Clustering #Spark (Apache Spark) #Batch #Data Lake #Security #Database Migration #Teradata #PySpark #Python #HDFS (Hadoop Distributed File System) #Data Architecture #SQL (Structured Query Language) #Storage #Data Governance #BTEQ #Google Cloud Storage #Big Data #Informatica PowerCenter #Migration #Data Pipeline #Business Analysis #GCP (Google Cloud Platform) #Data Security #Database Design #Deployment #MLOAD (MultiLoad) #Informatica #ETL (Extract, Transform, Load) #Data Catalog #YARN (Yet Another Resource Negotiator) #FastLoad #Data Processing #Libraries #Data Engineering
Role description
Role: GCP Data Architect
Location: Concord, CA (Onsite)
Long Term

Job Description

What You'll Do (Responsibilities):
• Architect & Design: Design and implement robust, scalable, and cost-effective data solutions on Google Cloud, serving as the target architecture for migrated workloads.
• Develop Reusable Frameworks & Accelerators: Design, build, and maintain reusable frameworks, templates, and code libraries to standardize and accelerate data engineering work. This includes creating boilerplate pipeline structures, generic data validation modules, and automated deployment patterns that other engineers will leverage.
• Migrate & Modernize: Lead the hands-on migration of data and processes from on-premises systems like Teradata and Hadoop to Google Cloud services, with a primary focus on BigQuery, Google Cloud Storage (GCS), Dataflow, and Dataproc.
• ETL/ELT Transformation: Analyze, deconstruct, and translate complex legacy ETL logic from tools like Informatica and Teradata BTEQ/stored procedures into modern, cloud-native pipelines, leveraging the frameworks and tooling you help create.
• Pipeline Development: Build and automate new data pipelines for batch and streaming data using Python, SQL, and GCP's core services, ensuring all new development contributes to and benefits from our shared engineering frameworks.
• Performance & Cost Optimization: Proactively optimize BigQuery performance through effective partitioning, clustering, and query tuning (see the sketch after this list).
• Data Validation & Governance: Develop and implement rigorous data validation frameworks to ensure data integrity and accuracy post-migration. Collaborate with governance teams to apply data security, lineage, and cataloging using tools like Google Cloud Data Catalog and Dataplex.
• Collaboration & Mentorship: Work closely with on-premises data experts, business analysts, and other engineers to understand requirements, ensure a smooth transition, and act as a subject matter expert and mentor for GCP and internal framework best practices.
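To make the partitioning and clustering responsibility above concrete, here is a minimal sketch, assuming the google-cloud-bigquery Python client and hypothetical project, dataset, and column names; it is an illustration of the technique the role calls out, not part of the job requirements.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

# Hypothetical schema for an events table being migrated to BigQuery.
schema = [
    bigquery.SchemaField("event_date", "DATE", mode="REQUIRED"),
    bigquery.SchemaField("customer_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("event_type", "STRING"),
    bigquery.SchemaField("amount", "NUMERIC"),
]

table = bigquery.Table("my-project.analytics.events", schema=schema)

# Partition by event_date so date-filtered queries scan only the
# relevant partitions, which is the main BigQuery cost lever.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="event_date",
)

# Cluster within each partition on columns that are commonly filtered
# or joined on, to prune data blocks and speed up queries further.
table.clustering_fields = ["customer_id", "event_type"]

table = client.create_table(table)
print(f"Created {table.full_table_id}")
```

Partitioning on the date column limits the bytes scanned by date-filtered queries, and clustering on frequently filtered columns prunes data within each partition, which is exactly the cost and performance tuning the responsibility describes.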
Required Qualifications (Must-Haves):
• Professional Experience: 5+ years of professional experience in a data engineering role, with a proven track record of building and maintaining large-scale data systems.
• Framework Design & Development: Demonstrable experience designing, building, and promoting the adoption of reusable data engineering frameworks (e.g., ingestion, transformation, validation).
• On-Premises Data Warehouse/Big Data Expertise: Deep, hands-on experience with at least one of the following on-premises ecosystems:
  • Teradata: Strong understanding of the Teradata architecture, utilities (BTEQ, TPT, FastLoad/MultiLoad), and advanced SQL/stored procedure development.
  • Hadoop: Experience with the Hadoop ecosystem (HDFS, YARN, Hive) and hands-on proficiency with PySpark for large-scale data processing.
• Enterprise ETL: Demonstrable experience designing and building complex workflows in an enterprise ETL tool like Informatica PowerCenter.
• Google Cloud Platform (GCP) Proficiency: Demonstrable hands-on experience designing, building, and operating solutions with a comprehensive set of GCP data services, including:
  • Core Data Processing & Warehousing: Google BigQuery (including data modeling, performance tuning, and cost management), Google Cloud Storage (GCS), Cloud Dataflow, and Cloud Dataproc.
  • Orchestration & Event-Driven Architecture: Cloud Composer (managed Airflow) for complex workflow orchestration, and Pub/Sub and Cloud Functions for building streaming and event-driven data pipelines.
  • Data Governance & Management: Practical experience using Dataplex for unified data management, security, and governance across data lakes and warehouses.
• Core Engineering & Migration Skills:
  • Expert-level proficiency in SQL, including complex joins, window functions, and performance tuning across different database engines.
  • Strong programming skills in Python, applying software engineering best practices.
  • Hands-on experience with Google's migration assessment tools (e.g., BigQuery Migration Service, Database Migration Service) to analyze on-premises workloads and accelerate migration (a post-migration reconciliation sketch follows this section).
  • Deep understanding of data warehousing concepts, ETL/ELT patterns, data modeling, and database design.
Preferred Qualifications (Nice-to-Haves):
• Proven Migration Experience: Direct, hands-on experience successfully completing at least one large-scale on-premises (Teradata, Hadoop, etc.) migration to Google Cloud Platform.
• Certifications: A Google Cloud Professional Data Engineer certification is a plus.
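As a companion to the migration and validation skills above, the following is a minimal sketch, assuming the google-cloud-bigquery client and hypothetical project, dataset, table, and control-table names, of a post-migration row-count reconciliation check. A production validation framework would typically also compare column checksums, aggregates, and sampled records.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical control table holding source-side row counts captured
# from the legacy system (Teradata/Hadoop) before cutover, plus the
# hypothetical target dataset the tables were migrated into.
CONTROL_TABLE = "my-project.migration_audit.source_row_counts"
TARGET_DATASET = "my-project.analytics"


def reconcile(table_name: str) -> bool:
    """Compare a migrated table's row count against the recorded source count."""
    target_count = next(iter(client.query(
        f"SELECT COUNT(*) AS n FROM `{TARGET_DATASET}.{table_name}`"
    ).result())).n

    source_count = next(iter(client.query(
        f"SELECT row_count AS n FROM `{CONTROL_TABLE}` WHERE table_name = @t",
        job_config=bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter("t", "STRING", table_name)]
        ),
    ).result())).n

    ok = source_count == target_count
    print(f"{table_name}: source={source_count} target={target_count} match={ok}")
    return ok


if __name__ == "__main__":
    migrated_tables = ["orders", "customers", "transactions"]  # hypothetical
    failures = [t for t in migrated_tables if not reconcile(t)]
    if failures:
        raise SystemExit(f"Row-count mismatch for: {', '.join(failures)}")
```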