OSI Engineering

Senior Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior Data Engineer on a 12-month contract based in Sunnyvale or San Diego, CA, with a pay rate of $75.00 - $90.00. It requires 8+ years of experience delivering ETL platforms and expertise in Apache Spark, Airflow, and data lakehouse technologies.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
720
🗓️ - Date
November 13, 2025
🕒 - Duration
More than 6 months
🏝️ - Location
On-site
📄 - Contract
W2 Contractor
🔒 - Security
Unknown
📍 - Location detailed
Sunnyvale, CA
🧠 - Skills detailed
#Anomaly Detection #Automated Testing #Deployment #Cloud #Apache Spark #Libraries #Apache Iceberg #Metadata #Compliance #ML (Machine Learning) #BigQuery #Dataflow #GIT #Data Engineering #Computer Science #Data Lineage #Scala #Delta Lake #Data Layers #Kubernetes #Observability #Storage #Apache Airflow #Documentation #Java #Data Lake #Data Lakehouse #Airflow #Data Management #GCP (Google Cloud Platform) #Prometheus #AWS (Amazon Web Services) #Azure #Batch #ETL (Extract, Transform, Load) #Python #Datasets #Security #Spark (Apache Spark) #Monitoring #Data Catalog #Grafana #Kafka (Apache Kafka) #Databricks
Role description
A global leader in consumer devices, based in Cupertino, CA, is looking for a Senior Data Engineer to join their team and help build the next generation of cellular analytics. You will work on production-grade ETL platforms that ingest, transform, and curate massive wireless telemetry datasets for near-real-time and batch use cases.

Role and Responsibilities:
• Design, implement, and operate resilient batch and streaming ETL jobs in Spark that process terabytes of cellular network data daily, with clear KPIs for latency and availability
• Build Airflow DAGs with strong observability, retries, SLAs, and automated remediation to keep production data flowing
• Develop reusable libraries, testing harnesses, and CI/CD workflows that enable rapid, safe deployments and empower partner teams to self-serve
• Partner with ML engineers to publish feature-ready datasets and model-monitoring telemetry that align with medallion best practices
• Implement automated validation, anomaly detection, and reconciliation frameworks that ensure trustworthy data at scale
• Instrument data lineage, metadata cataloging, and documentation workflows to support discovery and compliance requirements
• Collaborate with platform and product teams, system engineers, researchers, and security teams

Required Skills and Experience:
• 8+ years delivering production ETL platforms and lakehouse datasets for large-scale systems, including ownership of business-critical workloads
• Proven experience architecting, operating, and continuously scaling petabyte-class ETL/ELT platforms that power mission-critical analytics and ML workloads across bronze/silver/gold data layers
• Ability to craft multi-year data platform roadmaps, drive architectural decisions, and align stakeholders around standards for quality, performance, and cost efficiency
• Deep hands-on proficiency with Apache Spark (batch and Structured Streaming) on-prem or in the cloud, including performance tuning, job observability, and production incident response
• Production experience orchestrating complex pipelines with Apache Airflow (or equivalent), including DAG design, robust dependency modeling, SLA management, and operational excellence
• Expertise with data lakehouse technologies (Apache Iceberg, Delta Lake, Hudi) and columnar storage formats (Parquet, ORC) for scalable, reliable data management
• Practical knowledge of event-streaming patterns and tooling such as Kafka, Kinesis, or Pulsar for ingesting high-volume network telemetry
• Strong foundation in Python, Scala, or Java; disciplined CI/CD, automated testing, infrastructure-as-code, and Git-based workflows
• Ability to design pragmatic schemas and semantic layers that serve ETL throughput, downstream analytics, and ML feature engineering
• Experience delivering pipelines on AWS, GCP, or Azure using services such as EMR, Databricks, Glue, Dataflow, BigQuery, or equivalent
• Familiarity with Kubernetes, containerized deployments, and observability stacks (Prometheus, Grafana, ELK, OpenTelemetry) for proactive monitoring, rapid recovery, and continuous improvement
• Experience working with large-scale telemetry data is a plus
• Bachelor's degree or higher in Computer Science, Data Engineering, Electrical Engineering, or a related technical field (or equivalent practical experience)

Type: Contract
Duration: 12 months with possible extension
Work Location: Sunnyvale, CA or San Diego, CA (100% on-site)
Pay rate: $75.00 - $90.00 (DOE)
No 3rd-party agencies or C2C