

Intelliswift Software
Senior AI Data Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior AI Data Engineer in Menlo Park, CA, for 7 months at a competitive pay rate. Key skills include advanced SQL, ML integration, and experience with large-scale pipelines. Familiarity with embeddings and generative AI is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
720
-
🗓️ - Date
May 2, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Menlo Park, CA
-
🧠 - Skills detailed
#ML (Machine Learning) #Storage #Code Reviews #Scala #AI (Artificial Intelligence) #Indexing #Data Engineering #Capacity Management #Data Pipeline #Compliance #SQL (Structured Query Language) #Data Quality #Datasets #Data Cleaning #Airflow #Classification #Batch #ETL (Extract, Transform, Load) #Debugging #Data Lifecycle
Role description
Job Title: Senior AI Data Engineer (Contract)
Location: Menlo Park, CA
Duration: 7 months (with potential for extension)
As a Senior AI Data Engineer, you will design and operate end‑to‑end pipelines that not only move and transform data, but enrich it using ML models such as classifiers, embedding models, and large language models. The role sits at the intersection of data engineering and ML systems, requiring strong systems thinking around throughput, retries, async execution, and capacity management.
You will work closely with engineers and researchers to support image generation and evaluation workflows, contributing directly to data quality, model performance, and scalability.
Required Skills & Experience
• Strong data engineering expertise, including advanced SQL, complex query optimization, and production pipeline orchestration (e.g., Airflow or equivalent)
• Hands‑on experience integrating ML inference into data pipelines, including calling inference endpoints, managing batching and throughput, and handling failures and retries at scale
• Experience operating large-scale production pipelines with high reliability and performance requirements.
• Proficiency using AI‑assisted coding tools to accelerate development, debugging, and code reviews.
• Strong communication skills and ability to collaborate with engineers, researchers, and cross‑functional teams.
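
The batching-and-retry pattern called out above can be sketched minimally in Python. The endpoint call here is a stub, and all names, batch sizes, and retry limits are illustrative rather than specific to this role:

```python
import time

def call_endpoint(batch):
    # Stand-in for a real inference endpoint call; it simply echoes
    # a placeholder label per record so the sketch is runnable.
    return [{"id": r["id"], "label": "ok"} for r in batch]

def infer_with_retries(records, batch_size=64, max_retries=3):
    """Run records through an inference endpoint in batches,
    retrying each failed batch with exponential backoff."""
    results = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                results.extend(call_endpoint(batch))
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # batch kept failing; surface the error
                time.sleep(2 ** attempt)  # back off before retrying
    return results
```

In a production pipeline the batch size and retry budget would be tuned against endpoint throughput and rate limits, which is the kind of capacity-management judgment this role involves.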
Preferred Qualifications
• Experience working with embeddings and vector search, including storage, indexing, and similarity queries.
• Familiarity with content understanding models, such as image classification, OCR, and safety or quality scoring.
• Experience using LLMs for data workflows, including automated annotation, data cleaning, or evaluation tasks.
• Knowledge of generative AI systems, particularly image generation and corresponding evaluation metrics.
• Background working in data engineering, ML engineering, or hybrid roles that support model training or evaluation.
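
As a rough illustration of the embedding and similarity-query work mentioned above, here is a minimal brute-force nearest-neighbor sketch; real systems would use an approximate-nearest-neighbor index, and the vectors and IDs here are hypothetical:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """Brute-force similarity search over an in-memory list of
    (doc_id, vector) pairs, returning the k most similar documents."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]
```

At the billions-of-records scale this posting describes, the same query shape would run against a dedicated vector index rather than a linear scan.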
Responsibilities
• AI‑Augmented Data Pipelines: Design and maintain large‑scale data pipelines (up to billions of records/images) that combine SQL-based transformations with ML model inference for data cleaning, labeling, and enrichment.
• Remote Inference Orchestration: Build and own systems that orchestrate remote model inference within pipelines, including batching, async execution, retries, fallback logic, and graceful degradation under load.
• Feature & Embedding Pipelines: Develop scalable pipelines to generate, store, validate, and serve vector embeddings. Manage nearest‑neighbor indexes and ensure data quality at scale.
• Data Curation at Scale: Source, filter, and curate training datasets using both structured queries and model‑derived signals (e.g., visual quality scores, content classification, safety filters). Own the end‑to‑end data lifecycle with a focus on quality, governance, and compliance.
• LLM‑Assisted Annotation: Design pipelines that use large language models and vision models for automated data annotation. Create auditing workflows to evaluate and improve annotation quality.
• Shared Tooling & Frameworks: Contribute reusable components and frameworks that simplify AI‑augmented data pipelines, such as standardized model‑invocation operators and async job orchestration patterns.
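
The async execution and graceful-degradation responsibilities above can be sketched with a concurrency cap, so that a pipeline fans work out to a remote model without overwhelming it. The annotator here is a stub and every name is illustrative:

```python
import asyncio

async def annotate(record):
    # Stand-in for a remote model call (e.g., an LLM labeling a record).
    await asyncio.sleep(0)
    return {**record, "annotation": "stub"}

async def run_pipeline(records, max_concurrency=8):
    """Fan records out to a remote annotator under a concurrency cap,
    so load on the model endpoint stays bounded as input volume grows."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(record):
        async with sem:
            return await annotate(record)

    # gather preserves input order in its results
    return await asyncio.gather(*(guarded(r) for r in records))
```

A reusable operator along these lines, with retries and fallback logic layered on top, is the sort of shared tooling the last responsibility describes.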






