O2 Technologies,Inc

Perception Data Pipeline Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub

This role is for a Perception Data Pipeline Engineer. The contract length and pay rate are not specified. It requires onsite work in Foster City, CA, at least 3 days a week. Key skills include Python, C++, PySpark, and experience with autonomous vehicles. A Bachelor's degree in a related field is required.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

Unknown

🗓️ - Date

May 28, 2026

🕒 - Duration

Unknown

🏝️ - Location

On-site

📄 - Contract

Unknown

🔒 - Security

Unknown

📍 - Location detailed

Redwood City, CA

🧠 - Skills detailed

#Lambda (AWS Lambda) #ML Ops (Machine Learning Operations) #Data Engineering #Data Pipeline #PySpark #Python #Monitoring #Computer Science #REST (Representational State Transfer) #AWS (Amazon Web Services) #Observability #S3 (Amazon Simple Storage Service) #Documentation #API (Application Programming Interface) #Docker #REST API #Databricks #C++ #Spark (Apache Spark) #Classification #AI (Artificial Intelligence) #ML (Machine Learning) #AWS S3 (Amazon Simple Storage Service)

Role description

About The Role

Software Engineer, Perception Attributes Autolabeling Pipeline
Onsite in Foster City, CA | at least 3 days in office

The Perception Attribute Flywheel team is looking for a Software Engineer to build and operate the autolabeling pipeline that accelerates human annotation throughput on vehicle attribute classification tasks.

Zoox is building a future for Riders, not drivers. The accuracy of our perception attribute models (recognizing emergency vehicles, school buses, brake lights, hazard signals, and more) depends on a steady flow of high-quality labeled examples drawn from our fleet's drive data. Today, every label is produced by a human annotator from scratch. We are building a pipeline that uses off-the-shelf foundation models (Gemini, SigLIP, CLIP) to pre-label tasks, so human reviewers verify and correct rather than labeling from scratch.

This role owns the pipeline engineering for that system: ingesting queued tasks from our annotator service, calling foundation-model APIs at fleet scale, writing structured predictions back into the labeling workflow, and operating the whole thing reliably. The team lead and supporting ML engineers own model selection, prompt design, and evaluation methodology; this role partners closely with them but is not expected to own those decisions.

If you take pride in building reliable, observable, well-tested data pipelines and want to ship a system that visibly accelerates an autonomous vehicle program, you will excel in this role.

Responsibilities
• Build the autolabeling pipeline: ingest queued tasks from the annotator service, dispatch them to foundation-model APIs (Gemini and others), parse structured outputs, and write pre-labels back to the labeling workflow
• Build the observability layer: per-task latency, per-model cost, per-attribute coverage, error-mode dashboards
• Run experiments designed by the team lead: set up the inputs, execute, collect outputs in formats the ML engineers can analyze
• Integrate the pipeline cleanly with existing Zoox systems, partnering with the data infrastructure team
• Document the system, write runbooks, and ensure a clean handoff at end of engagement

Qualifications
• 3+ years of backend / data pipeline engineering experience
• Strong Python; comfort with C++
• Large-dataset experience with PySpark or equivalent
• ML fundamentals: understanding of model inference, embeddings, structured output, and common eval metrics (precision, recall, calibration); able to reason about ML data shapes and integration patterns
• Experience integrating foundation models (Gemini, OpenAI, Anthropic) at production scale
• Excellent written communication for design docs and runbooks

Bonus Qualities: Experience With Any Of The Following
• Databricks
• End-to-end ML pipeline stewardship: owned an ML system in production from data ingest through inference through monitoring
• Annotation tooling or human-in-the-loop ML workflows
• Autonomous-systems data pipelines
• AWS, especially S3, ECS/EKS, Lambda
• Working in a codebase shared with ML engineers (proto schemas, joint deploys)

Key Responsibilities & Skills
• Autolabeling Pipeline Development
• Vehicle Attribute Classification Data Flow
• Human-in-the-Loop Annotation Acceleration
• ML Model Inference Integration
• Observability & Monitoring of Data Pipelines
• Experimentation Support for ML Teams
• Documentation & Runbook Creation
• Cross-Team Integration with Data Infrastructure

Technical Skills
• Python
• C++
• PySpark / Spark
• AWS (S3 / ECS / EKS / Lambda)
• Databricks
• Foundation Model APIs (Gemini / OpenAI / Anthropic)
• REST API Integration
• Docker / Containerization
• Observability Dashboards

Education
Bachelor's Degree in Computer Science, Software Engineering, Electrical Engineering, Computer Engineering.
Preferred: Master's in Computer Science, Master's in Artificial Intelligence, Master's in Machine Learning, PhD in Computer Science.

Industry Experience
• Autonomous Vehicles
• Autonomous Driving
• Automotive
• Computer Vision
• Machine Learning Operations (MLOps)
• Data Engineering for AV

#CareerOpportunities #JobVacancy #WorkWithUs
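For readers who want a concrete picture of the pre-labeling flow described in the role (ingest queued tasks, call a model, parse the structured output, record a pre-label with latency), the following is a minimal sketch. It is illustrative only and is not the team's actual system: `Task`, `PreLabel`, `parse_prediction`, `run_prelabeling`, `fake_model`, the JSON response shape, and the example S3 URIs are all assumptions made for this example, and the model call is a local stand-in, not a real foundation-model API.

```python
from __future__ import annotations

import json
import time
from dataclasses import dataclass
from typing import Callable, Optional


# All types and functions below are hypothetical, for illustration only.
@dataclass
class Task:
    task_id: str
    attribute: str   # e.g. "brake_lights"
    image_uri: str


@dataclass
class PreLabel:
    task_id: str
    attribute: str
    label: Optional[str]          # None when the output could not be parsed
    confidence: Optional[float]
    latency_s: float              # per-task latency, for observability
    error: Optional[str] = None   # error mode, for dashboards


def parse_prediction(raw: str) -> tuple[str, float]:
    """Parse an assumed JSON response: {"label": ..., "confidence": ...}."""
    data = json.loads(raw)
    label = str(data["label"])
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return label, confidence


def run_prelabeling(tasks: list[Task],
                    call_model: Callable[[Task], str]) -> list[PreLabel]:
    """Call the model once per task and record a pre-label or an error."""
    results = []
    for task in tasks:
        start = time.perf_counter()
        label, confidence, error = None, None, None
        try:
            label, confidence = parse_prediction(call_model(task))
        except (KeyError, TypeError, ValueError) as exc:
            # json.JSONDecodeError is a subclass of ValueError.
            error = f"{type(exc).__name__}: {exc}"
        latency = time.perf_counter() - start
        results.append(PreLabel(task.task_id, task.attribute,
                                label, confidence, latency, error))
    return results


def fake_model(task: Task) -> str:
    """Stand-in for a real model API call; returns bad output for task t2."""
    if task.task_id == "t2":
        return "not json"
    return json.dumps({"label": "on", "confidence": 0.9})


tasks = [
    Task("t1", "brake_lights", "s3://example-bucket/a.jpg"),
    Task("t2", "hazard_signals", "s3://example-bucket/b.jpg"),
]
prelabels = run_prelabeling(tasks, fake_model)
for p in prelabels:
    print(p.task_id, p.label, p.confidence, p.error)
```

Keeping the model call behind a plain callable is one possible design choice: it lets the parsing and error-handling path be tested without network access, and a failed parse becomes a recorded error rather than a crashed batch, so the task can still go to a human annotator unlabeled.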


