Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer 4 in San Jose, California, for 5 months at $99.13/hour. It requires 9–12 years of data engineering experience and proficiency in Python, PySpark, SQL, and AWS, with strong skills in data pipeline design and orchestration tools.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
800
🗓️ - Date discovered
August 22, 2025
🕒 - Project duration
3 to 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
W2 Contractor
🔒 - Security clearance
Unknown
📍 - Location detailed
San Jose, CA
🧠 - Skills detailed
#Spark (Apache Spark) #Data Science #AWS (Amazon Web Services) #Data Pipeline #S3 (Amazon Simple Storage Service) #SQL (Structured Query Language) #API (Application Programming Interface) #Security #Data Engineering #Apache Spark #Scala #ETL (Extract, Transform, Load) #GIT #Data Processing #Data Quality #Version Control #Data Modeling #Spark SQL #SQS (Simple Queue Service) #Automation #PySpark #Apache Airflow #Lambda (AWS Lambda) #Data Security #Compliance #Scripting #Databricks #Airflow #Cloud #Python #Data Manipulation
Role description
Job Title: Data Engineer 4
Location: San Jose, California (Hybrid – 3 days/week onsite at 345 Park Ave, CASJ 95110)

Work Schedule:
• Duration: 5 months
• Hours: 40 hours/week (8 hours/day), Monday–Friday
• Hybrid: 3 days per week onsite, 2 days remote

Pay Rate:
• $99.13/hour (USD) – W2 only

Work Authorization:
• W2 Employment Only (No C2C or Independent Contractors)
• H-1B Sponsorship: Not Available
• Subcontracting: Not Allowed

Job Summary / Objective
The Data Engineer 4 will design, build, and maintain large-scale, reliable data pipelines and workflows to support data processing and analytics across the enterprise. This role requires strong expertise in Python, PySpark, SQL, cloud platforms, and orchestration tools, along with the ability to work independently on complex technical tasks and to ensure the highest levels of data quality, compliance, and operational efficiency.

Key Responsibilities
• Design, develop, and maintain scalable data pipelines for large-scale data processing.
• Build and optimize workflows with orchestration tools (e.g., Apache Airflow) and Spark to support scheduled and event-driven ETL/ELT processes.
• Implement parsing, cleansing, and transformation logic to normalize data from structured and unstructured sources.
• Collaborate with data scientists, analysts, and application teams to integrate and validate data products and pipelines.
• Operate and maintain pipelines on cloud platforms (AWS) and distributed compute environments (e.g., Databricks).
• Monitor pipeline performance, troubleshoot failures, and perform root cause analysis to ensure reliability and uptime.
• Enforce data security, compliance, and governance across systems and environments.
• Drive automation and standardization of data engineering processes to improve development velocity and operational efficiency.

Skills & Experience Required
• 9–12 years of experience in data engineering.
• Proficiency in Python and PySpark for data processing and scripting.
• Strong experience with SQL for data manipulation and optimization.
• Deep knowledge of distributed data processing with Apache Spark.
• Hands-on experience with Airflow or similar orchestration tools.
• Expertise with Databricks for collaborative data engineering and analytics.
• Experience with AWS services (S3, Lambda, SQS, API Gateway, networking).
• Strong understanding of data modeling, data warehousing, and pipeline architecture best practices.
• Familiarity with CI/CD practices and version control (Git).
• Strong analytical and problem-solving skills; able to work independently on complex tasks.
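What this looks like in practice
For candidates gauging fit, the responsibilities above center on parsing, cleansing, and normalizing large datasets with PySpark on AWS. The sketch below is an editorial illustration only, not client code or a requirement of this posting; the bucket paths and column names are placeholders.

# Hypothetical sketch: a PySpark job that reads raw JSON events from S3,
# normalizes a few fields, de-duplicates, and writes partitioned Parquet.
# All paths and column names are placeholders, not details from the posting.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("normalize_events").getOrCreate()

raw = spark.read.json("s3a://example-bucket/raw/events/")  # placeholder input path

cleaned = (
    raw
    .withColumn("event_ts", F.to_timestamp("event_time"))   # parse timestamps
    .withColumn("user_id", F.trim(F.col("user_id")))        # basic cleansing
    .dropDuplicates(["event_id"])                           # de-duplicate
    .filter(F.col("event_ts").isNotNull())                  # drop unparseable rows
    .withColumn("event_date", F.to_date("event_ts"))        # partition column
)

(
    cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-bucket/curated/events/")  # placeholder output path
)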
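The role also calls for orchestrating such jobs with Airflow on a scheduled or event-driven basis. A minimal daily DAG that submits a Spark job and runs a placeholder data-quality gate might look like the following; this assumes the Apache Spark provider for Airflow is installed, and the DAG ID, connection ID, and job path are again placeholders.

# Hypothetical sketch: a daily Airflow DAG that submits a PySpark job and
# then runs a stub data-quality check. Names and paths are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator


def check_row_count():
    # Placeholder data-quality gate: a real check might query the job's
    # output location or a metrics table and raise an error to fail the run.
    pass


with DAG(
    dag_id="daily_events_etl",          # placeholder DAG name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",                  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    transform_events = SparkSubmitOperator(
        task_id="transform_events",
        application="s3://example-bucket/jobs/normalize_events.py",  # placeholder job path
        conn_id="spark_default",
    )

    validate_output = PythonOperator(
        task_id="validate_output",
        python_callable=check_row_count,
    )

    transform_events >> validate_output

The client's actual pipelines, quality checks, and deployment patterns will differ; treat these sketches purely as orientation for the skills listed above.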