Infotree Global Solutions

ONLY W2 & LOCAL CANDIDATES IN Sunnyvale, CA OR San Diego, CA :: Sr. Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Sr. Data Engineer based in Sunnyvale or San Diego, CA; the contract length and pay rate are unspecified. It requires 8+ years of ETL experience, proficiency in Apache Spark, and expertise in cloud platforms.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 13, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Sunnyvale, CA or San Diego, CA
-
🧠 - Skills detailed
#Automated Testing #Deployment #Cloud #Apache Spark #Apache Iceberg #ML (Machine Learning) #BigQuery #Leadership #Dataflow #GIT #Data Engineering #Computer Science #Data Modeling #Scala #Delta Lake #Data Layers #Kubernetes #Observability #Storage #Apache Airflow #Java #Data Lake #Data Lakehouse #Airflow #Data Management #GCP (Google Cloud Platform) #Prometheus #AWS (Amazon Web Services) #Azure #Batch #ETL (Extract, Transform, Load) #Python #Datasets #Spark (Apache Spark) #Monitoring #Grafana #Kafka (Apache Kafka) #Databricks
Role description
Responsibilities:
• End-to-End ETL Mastery: Proven experience architecting, operating, and continuously scaling petabyte-class ETL/ELT platforms that power mission-critical analytics and ML workloads across bronze/silver/gold data layers.
• Architecture Leadership: Ability to craft multi-year data platform roadmaps, drive architectural decisions, and align stakeholders around standards for quality, performance, and cost efficiency.
• Spark-Centric Engineering: Deep hands-on proficiency with Apache Spark (batch and structured streaming) on on-prem or cloud stacks, including performance tuning, job observability, and production incident response.
• Workflow Orchestration: Production experience orchestrating complex pipelines with Apache Airflow (or equivalent), including DAG design, robust dependency modeling, SLA management, and operational excellence (a minimal DAG sketch follows the role description).
• Lakehouse & Storage Formats: Expertise with data lakehouse technologies (Apache Iceberg, Delta Lake, Hudi) and columnar storage formats (Parquet, ORC) for scalable, reliable data management.
• Streaming & Messaging: Practical knowledge of event streaming patterns and tooling such as Kafka, Kinesis, or Pulsar for ingesting high-volume network telemetry (see the streaming ingest sketch below).
• Software Engineering Rigor: Strong foundation in Python, Scala, or Java; disciplined CI/CD, automated testing, infrastructure-as-code, and Git-based workflows.
• Data Modeling in Context: Ability to design pragmatic schemas and semantic layers that serve ETL throughput, downstream analytics, and ML feature engineering.
• Cloud-Native Platforms: Experience delivering pipelines on AWS, GCP, or Azure using services like EMR, Databricks, Glue, Dataflow, BigQuery, or equivalent.
• Operational Resilience: Familiarity with Kubernetes, containerized deployments, and observability stacks (Prometheus, Grafana, ELK, OpenTelemetry) for proactive monitoring, rapid recovery, and continuous improvement.
Education & Experience:
• Bachelor's, Master's, or PhD in Computer Science, Data Engineering, Electrical Engineering, or a related technical field (or equivalent practical experience).
• 8+ years delivering production ETL platforms and lakehouse datasets for large-scale systems, including ownership of business-critical workloads.
• Experience working with large-scale telemetry data is a plus.
• Familiarity with the client's internal data platforms and development processes is preferred.
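To illustrate the streaming and lakehouse skills called out above, here is a minimal PySpark structured-streaming sketch that reads telemetry events from Kafka and appends them to a Parquet-backed bronze table. It is a sketch only: the broker address, topic name, event schema, and storage paths are hypothetical placeholders, not details from the posting, and the Kafka source assumes the spark-sql-kafka connector is available on the cluster.

```python
# Illustrative sketch only: hypothetical Kafka topic, schema, and paths.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("telemetry-bronze-ingest").getOrCreate()

# Assumed shape of each telemetry event; the real schema would come from the source system.
event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("metric", StringType()),
    StructField("value", StringType()),
    StructField("event_time", TimestampType()),
])

# Read raw events from a (placeholder) Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "network-telemetry")           # placeholder topic
    .load()
)

# Parse the JSON payload into typed columns.
events = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append to the bronze layer; the format could equally be "iceberg" or "delta".
query = (
    events.writeStream.format("parquet")
    .option("path", "s3://example-lake/bronze/telemetry")          # placeholder path
    .option("checkpointLocation", "s3://example-lake/_chk/telemetry")
    .outputMode("append")
    .trigger(processingTime="1 minute")
    .start()
)

query.awaitTermination()
```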
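Likewise, a minimal Apache Airflow sketch of the bronze → silver → gold orchestration pattern referenced in the Workflow Orchestration bullet. The DAG id, schedule, SLA, and task callables are hypothetical, and the `schedule` argument assumes Airflow 2.4+.

```python
# Illustrative sketch only: hypothetical DAG id, schedule, and placeholder tasks.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_telemetry(**context):
    """Pull the day's raw telemetry into the bronze layer (placeholder)."""
    print("extracting", context["ds"])


def build_silver(**context):
    """Cleanse and conform bronze records into silver tables (placeholder)."""
    print("transforming", context["ds"])


def publish_gold(**context):
    """Aggregate silver data into gold, analytics-ready datasets (placeholder)."""
    print("publishing", context["ds"])


with DAG(
    dag_id="telemetry_daily_etl",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract = PythonOperator(task_id="extract_bronze", python_callable=extract_telemetry)
    transform = PythonOperator(
        task_id="build_silver",
        python_callable=build_silver,
        sla=timedelta(hours=2),  # surface an SLA miss if the silver build runs long
    )
    publish = PythonOperator(task_id="publish_gold", python_callable=publish_gold)

    # Explicit bronze -> silver -> gold dependency chain.
    extract >> transform >> publish
```

The explicit `extract >> transform >> publish` chain is the dependency-modeling style the bullet refers to; in a real pipeline each placeholder callable would be replaced by production operators or TaskFlow tasks.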