TEKNIKOZ

GCP Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a GCP Data Engineer with 6 to 10 years of experience, focusing on BigQuery, Dataflow, and Cloud Composer. Contract length is unspecified, pay rate is "competitive," and remote work is allowed. Requires strong Python, PySpark, and Bash skills.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
πŸ—“οΈ - Date
February 18, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
πŸ“ - Location detailed
Phoenix, AZ
-
🧠 - Skills detailed
#Datasets #SQL (Structured Query Language) #Scala #Deployment #Batch #Data Modeling #Data Ingestion #Data Warehouse #"ETL (Extract, Transform, Load)" #Unix #Airflow #BigQuery #Scripting #Version Control #Data Architecture #Cloud #Dataflow #Automation #Data Quality #Agile #Scrum #Data Engineering #GIT #Bash #Data Pipeline #Code Reviews #Programming #Databases #Linux #PySpark #Storage #Data Lake #Data Processing #Computer Science #Apache Airflow #GCP (Google Cloud Platform) #Spark (Apache Spark) #Shell Scripting #Clustering #Python
Role description
Note: Please apply only if you are a Green Card holder.

Role Summary
We are looking for a GCP Data Engineer with 6 to 10 years of experience to design, build, and maintain data pipelines and analytics solutions using Google Cloud Platform services, with a strong focus on BigQuery, Dataproc, Dataflow, Cloud Storage, and Cloud Composer/Airflow in a PySpark-driven environment. The ideal candidate has solid experience in batch and streaming ETL, workflow orchestration, and Bash scripting within GCP.

Key Responsibilities
• Design, develop, and maintain scalable data pipelines and ETL/ELT workflows using BigQuery, Dataproc, Dataflow, and Cloud Storage.
• Build and optimize PySpark-based data processing jobs running on Dataproc or serverless Spark in GCP.
• Develop, schedule, and monitor DAGs in Cloud Composer/Apache Airflow for end-to-end workflow orchestration.
• Implement data ingestion from varied sources (APIs, files, databases, streaming) into GCP and load it into the BigQuery data warehouse.
• Optimize BigQuery queries, partitioning, clustering, and table design for performance and cost efficiency.
• Implement data quality checks, validation rules, and reconciliation for critical data pipelines.
• Manage and secure GCS buckets, lifecycle policies, and efficient storage structures for raw, curated, and processed data.
• Write and maintain Bash/shell scripts for automation, deployment, and operational tasks across GCP environments.
• Collaborate with data architects, analysts, and business stakeholders to translate requirements into technical solutions.
• Participate in CI/CD, code reviews, and best practices for version control, testing, and deployment of data pipelines.
• Monitor, troubleshoot, and resolve issues in production pipelines with clear root cause analysis and incident reporting.

Required Skills and Qualifications
• Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
• 6 to 10 years of experience in data engineering, with at least 2 years on Google Cloud Platform.
• Strong hands-on experience with:
  • BigQuery (SQL development, performance tuning, partitioning, clustering).
  • Dataflow or Dataproc for large-scale data processing.
  • Cloud Storage (GCS) for data lake design and management.
  • Cloud Composer/Apache Airflow for DAG development and scheduling.
• Strong programming skills in Python and PySpark in a GCP environment.
• Proficiency in Linux/Unix and Bash/shell scripting for automation.
• Solid understanding of ETL/ELT concepts, data warehousing, and data modeling.
• Experience working with large datasets, performance optimization, and cost control on cloud data platforms.
• Familiarity with Git and CI/CD practices for data pipelines.
• Strong analytical, problem-solving, and communication skills.

Good to Have
• Google Professional Data Engineer or other GCP certifications.
• Experience with additional GCP services such as Pub/Sub, Cloud Functions, Cloud Run, Cloud SQL, Bigtable, or Data Fusion.
• Exposure to streaming data pipelines and event-driven architectures.
• Experience in Agile/Scrum environments and working with distributed teams.
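For illustration, here is a minimal sketch of the kind of PySpark batch job the responsibilities above describe: reading raw files from Cloud Storage, applying basic cleansing, and loading the result into BigQuery. The bucket, table, and column names are hypothetical placeholders, and it assumes the job runs on Dataproc (or serverless Spark) with the spark-bigquery connector available on the image.

```python
# Minimal sketch of a PySpark batch load; paths, tables, and columns are
# hypothetical. Assumes Dataproc with the spark-bigquery connector.
from pyspark.sql import SparkSession, functions as F


def main():
    spark = SparkSession.builder.appName("events-batch-load").getOrCreate()

    # Read the day's raw JSON files from the data-lake landing zone in GCS.
    raw = spark.read.json("gs://example-raw-bucket/events/dt=2026-02-18/")

    # Basic cleansing and typing before loading to the warehouse.
    cleaned = (
        raw.dropDuplicates(["event_id"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .filter(F.col("event_id").isNotNull())
    )

    # Write to BigQuery via the connector; the indirect write path needs an
    # intermediate GCS bucket for staging.
    (
        cleaned.write.format("bigquery")
               .option("table", "example_project.curated.events")
               .option("temporaryGcsBucket", "example-temp-bucket")
               .mode("append")
               .save()
    )

    spark.stop()


if __name__ == "__main__":
    main()
```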
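Similarly, a minimal sketch of a Cloud Composer/Apache Airflow DAG orchestrating a daily GCS-to-BigQuery load of the kind mentioned above. DAG, bucket, and table names are hypothetical, and it assumes an Airflow 2.x Composer environment with the Google provider package installed.

```python
# Minimal sketch of a daily Composer/Airflow DAG; all names are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="daily_sales_load",          # hypothetical pipeline name
    schedule_interval="0 6 * * *",      # once a day at 06:00 UTC
    start_date=datetime(2026, 1, 1),
    catchup=False,
    default_args=default_args,
    tags=["gcp", "bigquery"],
) as dag:

    # Load the day's raw CSV files from a GCS landing bucket into staging.
    load_raw = GCSToBigQueryOperator(
        task_id="load_raw_to_staging",
        bucket="example-raw-bucket",
        source_objects=["sales/{{ ds }}/*.csv"],
        destination_project_dataset_table="example_project.staging.sales_raw",
        source_format="CSV",
        skip_leading_rows=1,
        write_disposition="WRITE_TRUNCATE",
    )

    # Transform staging rows into the curated table with a SQL job.
    transform = BigQueryInsertJobOperator(
        task_id="transform_to_curated",
        configuration={
            "query": {
                "query": (
                    "INSERT INTO `example_project.curated.sales` "
                    "SELECT * FROM `example_project.staging.sales_raw` "
                    "WHERE sale_date = DATE('{{ ds }}')"
                ),
                "useLegacySql": False,
            }
        },
    )

    load_raw >> transform
```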
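Finally, a short sketch of the BigQuery table-design work the posting calls out (partitioning and clustering for performance and cost), using the google-cloud-bigquery Python client. Project, dataset, table, and column names are hypothetical, and it assumes the client library is installed with application-default credentials configured. Filtering on the partition column lets BigQuery prune partitions, which is the main lever for reducing both scanned bytes and query cost.

```python
# Minimal sketch of creating a partitioned, clustered BigQuery table and
# running a partition-pruned query; all names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

# Partition by sale date and cluster by the most common filter column so
# queries scan fewer bytes and cost less.
table = bigquery.Table(
    "example-project.curated.sales",
    schema=[
        bigquery.SchemaField("sale_id", "STRING"),
        bigquery.SchemaField("store_id", "STRING"),
        bigquery.SchemaField("sale_date", "DATE"),
        bigquery.SchemaField("amount", "NUMERIC"),
    ],
)
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="sale_date",
)
table.clustering_fields = ["store_id"]
client.create_table(table, exists_ok=True)

# Querying with a filter on the partition column prunes partitions.
query = """
    SELECT store_id, SUM(amount) AS total_amount
    FROM `example-project.curated.sales`
    WHERE sale_date = @sale_date
    GROUP BY store_id
"""
job = client.query(
    query,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("sale_date", "DATE", "2026-02-18")
        ]
    ),
)
for row in job.result():
    print(row.store_id, row.total_amount)
```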