HireTalent - Diversity Staffing & Recruiting Firm

Generative AI Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Generative AI Engineer based in Rockville, MD, on a 12+ month contract, offering a competitive pay rate. Key skills include data engineering with Apache Spark, SQL optimization, and experience with LLM-powered agent systems.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
May 12, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Rockville, MD
-
🧠 - Skills detailed
#Presto #GitHub #S3 (Amazon Simple Storage Service) #Logging #Monitoring #Prometheus #Data Lake #Debugging #Datasets #Documentation #Langchain #Terraform #Data Pipeline #OpenSearch #ChatGPT #Apache Spark #Docker #Cloud #Infrastructure as Code (IaC) #Big Data #Grafana #Spark (Apache Spark) #Observability #Data Engineering #Programming #AWS S3 (Amazon Simple Storage Service) #EC2 #Databases #Lambda (AWS Lambda) #Trino #AI (Artificial Intelligence) #AWS (Amazon Web Services) #GitLab #SageMaker #SQL Queries #Data Quality #Complex Queries #ETL (Extract, Transform, Load) #Scala #Jenkins #API (Application Programming Interface) #Anomaly Detection #Dataiku #PySpark #Data Catalog #SQL (Structured Query Language) #Kubernetes #Python #Automation
Role description
Title: Generative AI Engineer
Location: Rockville, MD (Hybrid)
Duration: 12+ months, Contract

Project Description
The Generative AI Engineer works with moderate supervision across two equally weighted domains: (1) large-scale data pipeline development processing market events in a cloud environment, and (2) design and development of agentic AI systems, including LLM-powered regulatory data assistants, MCP servers, and agent harness architectures. This position contributes to overall product quality throughout the software development lifecycle.

Responsibilities
• Build and maintain ETL/ELT pipelines using Apache Spark, Hive, and Trino across S3-based data lake environments
• Develop and optimize SQL for large-scale surveillance datasets, including window functions, multi-table joins, and complex aggregations
• Build and engineer big data systems (EMR on EC2, EMR on EKS) and develop solutions on analytical platforms (SageMaker, Domino, Dataiku)
• Participate in data quality monitoring, anomaly detection, and production incident investigation
• Develop AI agent systems using AWS Bedrock and agent frameworks (Strands Agents SDK, LangChain/LangGraph, or equivalent)
• Build agent harness architectures that combine LLM reasoning with deterministic execution, skill/RAG-based SQL generation, and structured output validation
• Implement agent memory, context management, and tool integration (MCP servers, API connectors, data catalog lookups) across the data lake
• Build evaluation frameworks for agent accuracy, covering paraphrase robustness, routing precision, and structural consistency
• Stay informed of advances in LLM frameworks (LangGraph, Google ADK, AWS Strands) and emerging AI capabilities
• Write clean, well-tested code; contribute to CI/CD Jenkins pipelines and infrastructure-as-code on AWS
• Ensure secure handling of RCI and sensitive regulatory data across both data pipelines and agent outputs, with auditable execution traces
• Adhere to • • • and team standards for secure development practices and technology policies
• Partner across teams, communicate technical information at the appropriate level, and maintain documentation on Confluence/Wiki
• Actively learn from senior team members; contribute to process improvement in line with • • •'s values of collaboration, expertise, innovation, and responsibility

Essential Technical Skills

Data Engineering & Big Data Technologies
• Experience building data pipelines using Apache Spark (PySpark preferred) and SQL
• Experience with SQL query engines (Hive, Trino/Presto, or similar) and cloud data platforms (AWS S3, EMR, Lambda)
• Understanding of common issues such as data skew and strategies to mitigate it, working with large data volumes, and troubleshooting job failures caused by resource limitations, bad data, and scalability challenges
• Real-world experience with debugging and mitigation strategies

Generative AI & Agentic Systems
• Practical experience building LLM-powered agent systems that use tools and produce structured outputs (not just chatbot interfaces)
• Hands-on experience with at least one agent framework: LangChain, LangGraph, AWS Strands, or equivalent
• Working knowledge of prompt engineering, RAG architectures, and context/memory management
• Experience with foundation model APIs (Anthropic Claude, Amazon Nova, OpenAI, or similar)
• Memory architecture: understanding of agent memory tiers (working, episodic, and semantic memory) and strategies for context persistence, pruning, and retrieval across sessions
• Agent harness design: familiarity with harness patterns that wrap LLM reasoning with deterministic guardrails, tool routing, verification loops, and graceful degradation

AI Tool Proficiency
• Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)
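The data-skew bullet above is easiest to see with a toy model. This is a minimal sketch, not project code: it uses a stand-in byte-sum hash (Spark's shuffle actually uses Murmur3) to show how "salting" a hot key spreads its rows across shuffle partitions; the key names and bucket counts are illustrative.

```python
NUM_PARTITIONS = 8
SALT_BUCKETS = 4

def bucket(key: str, n: int = NUM_PARTITIONS) -> int:
    # Toy deterministic hash (sum of byte values). Spark uses Murmur3,
    # but the skew mechanics shown here are the same.
    return sum(key.encode()) % n

# Without salting: every row for the hot key "AAPL" shuffles to ONE
# partition, so that single task dominates the stage runtime.
plain = {bucket("AAPL") for _ in range(1_000)}

# With salting: append a salt suffix (0..3) before hashing, so the hot
# key's rows spread over up to SALT_BUCKETS partitions. Downstream, the
# job aggregates per (key, salt) first, then re-aggregates per key.
salted = {bucket(f"AAPL{salt}") for salt in range(SALT_BUCKETS)}

print(len(plain), len(salted))  # 1 partition vs. 4 partitions
```

In Spark 3.x, adaptive query execution can handle many skewed joins automatically, but manual salting remains the standard fix for skewed aggregations.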
• Experience with spec-driven development: using structured specifications to guide AI code generation, review, and validation
• Ability to leverage AI pair programming for code suggestions, debugging, refactoring, and automated test generation

Cloud Technologies
• Experience with AWS services such as S3, EMR, EMR on EKS, Lambda, Bedrock, and Step Functions
• Hands-on experience using S3 with Spark (e.g., dealing with file formats and consistency issues)
• Familiarity with AWS Bedrock for foundation model invocation, knowledge bases, guardrails, and agent orchestration
• Exposure to Google Cloud Vertex AI (model garden, grounding, agent builder) or equivalent managed AI platforms
• Familiarity with AWS monitoring and logging tools (CloudWatch, CloudTrail) for production workloads

Programming – Python
• Proficiency in Python for data engineering and automation
• Ability to write clean, modular, and performant code
• Experience with functional programming concepts (e.g., immutability, higher-order functions)
• Strong understanding of collections, concurrency, and memory management

SQL Skills (Window Functions, Joins, Complex Queries)
• Proficiency with SQL window functions, multi-table joins, and aggregations
• Ability to write and optimize complex SQL queries
• Experience handling edge cases such as NULLs, duplicates, and ordering

Good to Have
• AWS Bedrock AgentCore (memory, identity, tool gateway)
• Model Context Protocol (MCP) server development and integration
• Agent evaluation harnesses and agentic patterns (draft-verification, compile-style generation)
• Fine-tuning foundation models for domain-specific tasks (LoRA, PEFT, or managed fine-tuning via Bedrock/Vertex AI)
• Local model execution with Ollama, vLLM, or similar for development and experimentation
• Vector databases (FAISS, Pinecone, OpenSearch)
• Docker, Kubernetes, and Amazon EKS for containerized workloads
• Infrastructure as Code (Terraform, CloudFormation)
• Experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, ArgoCD)
• Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack)
• AWS certifications (AI Practitioner, Solutions Architect, or Kubernetes certifications such as CKA/CKAD)
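To illustrate the SQL proficiency the posting asks for (window functions, partitioned ranking, NULL handling), here is a minimal sketch against an in-memory SQLite table. The table name, columns, and rows are invented for the example; the same query shape carries over to Hive or Trino on the real surveillance datasets.

```python
import sqlite3

# Hypothetical miniature "market events" table for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (symbol TEXT, ts INTEGER, price REAL);
INSERT INTO events VALUES
  ('AAPL', 1, 189.0), ('AAPL', 2, 190.5),
  ('MSFT', 1, 402.0), ('MSFT', 3, 399.8);
""")

# Latest event per symbol: rank rows within each symbol by timestamp
# descending, then keep rank 1. COALESCE guards against NULL prices,
# one of the edge cases the posting calls out.
latest = conn.execute("""
SELECT symbol, ts, COALESCE(price, 0.0) AS price
FROM (
  SELECT symbol, ts, price,
         ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY ts DESC) AS rn
  FROM events
) AS t
WHERE rn = 1
ORDER BY symbol
""").fetchall()
# latest == [('AAPL', 2, 190.5), ('MSFT', 3, 399.8)]
```

ROW_NUMBER() with PARTITION BY is the usual "latest record per key" pattern; at scale the same query runs unchanged on Trino, though partition pruning and join ordering then dominate performance.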