Tek Leaders Inc

AI Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for an AI Data Engineer on a long-term W2 contract, requiring 12+ years in IT, including 2-4 years building AI agents. Key skills include AWS, data pipelines, ETL, and data governance. The role is fully remote.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
March 7, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
San Francisco Bay Area
-
🧠 - Skills detailed
#Datasets #Scala #Compliance #Data Quality #ETL (Extract, Transform, Load) #Triggers #AWS (Amazon Web Services) #Indexing #Deployment #Data Access #Batch #Databases #Observability #SaaS (Software as a Service) #Data Pipeline #Cloud #Classification #Data Governance #AI (Artificial Intelligence) #Data Engineering
Role description
AI Data Engineer | Remote | Long-term Contract on W2

Minimum 12+ years of IT experience required, including 2 to 4 years working with AI agents across cloud, MCP, and tasks.

• Design and build scalable data pipelines for AI agents across cloud platforms
• Create and maintain agent-ready data models, schemas, and data contracts
• Build and operate vector data pipelines (data prep, chunking, embeddings, indexing, re-indexing)
• Integrate structured, semi-structured, and unstructured data sources for agent consumption
• Develop MCP (Model Context Protocol) data adapters/connectors for databases, APIs, SaaS, files, and streams
• Define standard MCP request/response schemas and transformation logic
• Integrate MCPs with the MCP gateway (auth, routing, throttling, observability)
• Build CI/CD pipelines for MCP build, test, deployment, and rollback
• Implement CI/CD pipelines for data pipelines, datasets, and vector stores
• Automate environment promotion (dev/test/prod) for data assets
• Embed data quality checks (schema validation, freshness, completeness) into pipelines
• Design and operate real-time streaming pipelines (event ingestion, enrichment, aggregation)
• Enable event-driven data triggers for AI agents
• Build batch + streaming hybrid architectures for historical and real-time context
• Develop and maintain certified data connectors for Low-Code / No-Code platforms
• Standardize enterprise data models for reuse by agents and citizen developers
• Manage secure data access using RBAC, managed identities, secrets, and tokenization
• Monitor data quality, drift, and freshness impacting agent behavior
• Implement data observability and lineage tracking across pipelines and MCPs
• Enforce data governance, classification, and compliance controls
• Optimize data performance, latency, and cost for agent workloads
• Experience developing the above using AWS cloud services and open-source tooling
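As a flavor of the vector-pipeline work described above (chunking text before computing embeddings and indexing), here is a minimal sketch of overlapping character-window chunking. The chunk size and overlap values are illustrative assumptions, not requirements from the posting.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    The overlap preserves context across chunk boundaries, a common
    preparation step before embedding and indexing into a vector store.
    Window sizes here are illustrative, not prescribed by the role.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

In practice the same sliding-window idea is applied at the token or sentence level rather than raw characters, but the re-indexing concern is identical: changing `chunk_size` or `overlap` invalidates existing embeddings.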
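The embedded data quality checks the posting lists (schema validation, freshness, completeness) could look roughly like the sketch below. The schema fields, 24-hour freshness window, and 95% completeness threshold are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Expected schema: field name -> required Python type (illustrative assumption)
EXPECTED_SCHEMA = {"id": int, "text": str, "updated_at": datetime}

def check_schema(record: dict) -> list[str]:
    """Validate that a record carries the expected fields and types."""
    errors = []
    for field, ftype in EXPECTED_SCHEMA.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], ftype):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    return errors

def check_freshness(record: dict, max_age: timedelta = timedelta(hours=24)) -> list[str]:
    """Flag records older than the freshness window (24h here, an assumption)."""
    ts = record.get("updated_at")
    if isinstance(ts, datetime) and datetime.now(timezone.utc) - ts > max_age:
        return [f"stale record: {record.get('id')}"]
    return []

def check_completeness(records: list[dict], min_ratio: float = 0.95) -> list[str]:
    """Fail the batch if too many records have empty text payloads."""
    if not records:
        return ["empty batch"]
    ratio = sum(1 for r in records if r.get("text")) / len(records)
    return [] if ratio >= min_ratio else [f"completeness {ratio:.0%} below {min_ratio:.0%}"]

def run_quality_gate(records: list[dict]) -> list[str]:
    """Aggregate all check failures; an empty list means the batch may proceed."""
    issues = check_completeness(records)
    for rec in records:
        issues += check_schema(rec) + check_freshness(rec)
    return issues
```

A gate like this would typically run as a pipeline step before data reaches agents, so that stale or malformed context never influences agent behavior.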