

Tek Leaders Inc
AI Data Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for an AI Data Engineer on a long-term W2 contract, requiring 12+ years in IT and 2-4 years of experience with AI agents. Key skills include AWS, data pipelines, ETL, and data governance. The role is fully remote.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
March 7, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
San Francisco Bay Area
-
🧠 - Skills detailed
#Datasets #Scala #Compliance #Data Quality #"ETL (Extract, Transform, Load)" #Triggers #AWS (Amazon Web Services) #Indexing #Deployment #Data Access #Batch #Databases #Observability #SaaS (Software as a Service) #Data Pipeline #Cloud #Classification #Data Governance #AI (Artificial Intelligence) #Data Engineering
Role description
AI Data Engineer
Remote
Long-term contract on W2
Minimum 12+ years of IT experience required, with at least 2 to 4 years working on AI agents across cloud platforms and MCP.
Tasks
• Design and build scalable data pipelines for AI agents across cloud platforms
• Create and maintain agent‑ready data models, schemas, and data contracts
• Build and operate vector data pipelines (data prep, chunking, embeddings, indexing, re‑indexing); see the vector pipeline sketch after this list
• Integrate structured, semi‑structured, and unstructured data sources for agent consumption
• Develop MCP (Model Context Protocol) data adapters/connectors for databases, APIs, SaaS, files, and streams
• Define standard MCP request/response schemas and transformation logic; see the envelope sketch after this list
• Integrate MCPs with the MCP gateway (auth, routing, throttling, observability)
• Build CI/CD pipelines for MCP build, test, deployment, and rollback
• Implement CI/CD pipelines for data pipelines, datasets, and vector stores
• Automate environment promotion (dev/test/prod) for data assets
• Embed data quality checks (schema validation, freshness, completeness) into pipelines; see the quality-gate sketch after this list
• Design and operate real‑time streaming pipelines (event ingestion, enrichment, aggregation)
• Enable event‑driven data triggers for AI agents; see the trigger sketch after this list
• Build batch + streaming hybrid architectures for historical and real‑time context
• Develop and maintain certified data connectors for Low‑Code / No‑Code platforms
• Standardize enterprise data models for reuse by agents and citizen developers
• Manage secure data access using RBAC, managed identities, secrets, and tokenization
• Monitor data quality, drift, and freshness impacting agent behavior
• Implement data observability and lineage tracking across pipelines and MCPs
• Enforce data governance, classification, and compliance controls
• Optimize data performance, latency, and cost for agent workloads
• Experience building the above using AWS cloud services and open-source tooling
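The vector pipeline bullet maps to a chunk → embed → index → retrieve loop. Below is a minimal, self-contained Python sketch of that loop; the embed() function is a hash-based placeholder standing in for a real embedding model (for example, one served via Amazon Bedrock), and the in-memory list stands in for a managed vector store.
```python
# Minimal sketch of a vector data pipeline: chunk -> embed -> index -> search.
# embed() is a deterministic placeholder, NOT a real embedding model.
import hashlib
import math

def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character windows with overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(chunk_text: str, dim: int = 8) -> list[float]:
    """Placeholder embedding: hash-derived unit vector."""
    digest = hashlib.sha256(chunk_text.encode()).digest()
    vec = [b / 255.0 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

index: list[tuple[str, list[float]]] = []   # (chunk, vector) pairs

def ingest(doc: str) -> None:
    for c in chunk(doc):
        index.append((c, embed(c)))

def search(query: str, k: int = 3) -> list[str]:
    qv = embed(query)
    scored = sorted(index, key=lambda e: -sum(a * b for a, b in zip(qv, e[1])))
    return [c for c, _ in scored[:k]]

ingest("Agents need fresh, well-chunked context to answer accurately. " * 20)
print(search("fresh context"))
```
Re-indexing in this model is simply re-running ingest() after clearing the index; a production pipeline would version the index and swap atomically.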
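For the MCP request/response bullet, the sketch below shows one way to standardize a connector envelope and its transformation logic. The field names (tool, params, ok, data, error) are assumptions for illustration, not the Model Context Protocol wire format, which is JSON-RPC based.
```python
# Illustrative standardized envelope for an MCP-style data connector.
# Field names are assumptions for this example, not the MCP wire format.
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ToolRequest:
    tool: str                                   # e.g. "orders.lookup" (hypothetical)
    params: dict[str, Any] = field(default_factory=dict)

@dataclass
class ToolResponse:
    ok: bool
    data: list[dict[str, Any]] = field(default_factory=list)
    error: str | None = None

def handle(req: ToolRequest,
           source: dict[str, list[dict[str, Any]]]) -> ToolResponse:
    """Route a request to a backing dataset and normalize the result shape."""
    rows = source.get(req.tool)
    if rows is None:
        return ToolResponse(ok=False, error=f"unknown tool: {req.tool}")
    wanted = req.params.get("status")           # simple transformation: filter
    data = [r for r in rows if wanted is None or r.get("status") == wanted]
    return ToolResponse(ok=True, data=data)

source = {"orders.lookup": [{"id": 1, "status": "shipped"},
                            {"id": 2, "status": "open"}]}
print(handle(ToolRequest("orders.lookup", {"status": "open"}), source))
```
Keeping every connector behind one envelope like this is what lets a gateway layer add auth, routing, throttling, and observability uniformly.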
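For the data quality bullet, here is a minimal sketch of quality gates a pipeline could run before publishing a batch, assuming illustrative column names and thresholds (24-hour freshness, 95% completeness).
```python
# Minimal data quality gates: schema, freshness, completeness.
# Column names and thresholds are illustrative assumptions.
from datetime import datetime, timedelta, timezone

REQUIRED_COLUMNS = {"id", "status", "updated_at"}
MAX_STALENESS = timedelta(hours=24)
MIN_COMPLETENESS = 0.95   # share of rows with no missing required values

def check_batch(rows: list[dict]) -> list[str]:
    failures = []
    # Schema validation: every required column present in every row.
    if any(not REQUIRED_COLUMNS <= row.keys() for row in rows):
        failures.append("schema: missing required columns")
    # Freshness: the newest record must be recent enough.
    timestamps = [row["updated_at"] for row in rows if row.get("updated_at")]
    if not timestamps or datetime.now(timezone.utc) - max(timestamps) > MAX_STALENESS:
        failures.append("freshness: missing timestamps or batch older than 24h")
    # Completeness: enough rows with all required fields populated.
    complete = sum(all(row.get(c) is not None for c in REQUIRED_COLUMNS)
                   for row in rows)
    if not rows or complete / len(rows) < MIN_COMPLETENESS:
        failures.append("completeness: too many missing values")
    return failures

batch = [{"id": 1, "status": "open", "updated_at": datetime.now(timezone.utc)}]
print(check_batch(batch) or "batch passed all quality gates")
```
A pipeline would run these gates as a step before environment promotion, blocking the publish when the returned failure list is non-empty.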
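For the event-driven triggers bullet, this is a tiny sketch of the dispatch pattern using an in-process queue and a hypothetical order.shipped event type; on AWS this pattern typically maps to EventBridge or SQS driving a Lambda handler.
```python
# Tiny sketch of an event-driven data trigger: events flow through a queue
# and matching ones invoke an agent callback. Event and rule names are
# hypothetical examples.
import queue

events: "queue.Queue[dict]" = queue.Queue()

def on_order_shipped(event: dict) -> None:
    print(f"agent notified: order {event['order_id']} shipped")

RULES = {"order.shipped": on_order_shipped}   # event type -> handler

def dispatch() -> None:
    while not events.empty():
        event = events.get()
        handler = RULES.get(event["type"])
        if handler:
            handler(event)

events.put({"type": "order.shipped", "order_id": 42})
dispatch()
```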