

HireTalent - Diversity Staffing & Recruiting Firm
Generative AI Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Generative AI Engineer based in Rockville, MD, on a 12+ month contract, offering a competitive pay rate. Key skills include data engineering with Apache Spark, SQL optimization, and experience with LLM-powered agent systems.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
May 12, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Rockville, MD
🧠 - Skills detailed
#Presto #GitHub #S3 (Amazon Simple Storage Service) #Logging #Monitoring #Prometheus #Data Lake #Debugging #Datasets #Documentation #Langchain #Terraform #Data Pipeline #OpenSearch #ChatGPT #Apache Spark #Docker #Cloud #Infrastructure as Code (IaC) #Big Data #Grafana #Spark (Apache Spark) #Observability #Data Engineering #Programming #AWS S3 (Amazon Simple Storage Service) #EC2 #Databases #Lambda (AWS Lambda) #Trino #AI (Artificial Intelligence) #AWS (Amazon Web Services) #GitLab #SageMaker #SQL Queries #Data Quality #Complex Queries #ETL (Extract, Transform, Load) #Scala #Jenkins #API (Application Programming Interface) #Anomaly Detection #Dataiku #PySpark #Data Catalog #SQL (Structured Query Language) #Kubernetes #Python #Automation
Role description
Title: Generative AI Engineer
Location: Rockville, MD/Hybrid
Duration: 12+ months Contract
Project Description
The Generative AI Engineer works with moderate supervision across two equally weighted domains: (1) large-scale data pipeline development processing market events in a cloud environment, and (2) design and development of agentic AI systems including LLM-powered regulatory data assistants, MCP servers, and agent harness architectures. This position contributes to overall product quality throughout the software development lifecycle.
Responsibilities
• Build and maintain ETL/ELT pipelines using Apache Spark, Hive, and Trino across S3-based data lake environments
• Develop and optimize SQL for large-scale surveillance datasets, including window functions, multi-table joins, and complex aggregations
• Build and engineer big data systems (EMR-on-EC2, EMR-on-EKS) and develop solutions on analytical platforms (SageMaker, Domino, Dataiku)
• Participate in data quality monitoring, anomaly detection, and production incident investigation
• Develop AI agent systems using AWS Bedrock and agent frameworks (Strands Agents SDK, LangChain/LangGraph, or equivalent)
• Build agent harness architectures that combine LLM reasoning with deterministic execution, including skill/RAG-based SQL generation and structured output validation
• Implement agent memory, context management, and tool integration (MCP servers, API connectors, data catalog lookups) across the data lake
• Build evaluation frameworks for agent accuracy, covering paraphrase robustness, routing precision, and structural consistency
• Stay informed of advances in LLM frameworks (LangGraph, Google ADK, AWS Strands) and emerging AI capabilities
• Write clean, well-tested code; contribute to Jenkins CI/CD pipelines and infrastructure as code on AWS
• Ensure secure handling of RCI and sensitive regulatory data across both data pipelines and agent outputs, with auditable execution traces
• Adhere to organizational and team standards for secure development practices and technology policies
• Partner across teams, communicate technical information at the appropriate level, and maintain documentation on Confluence/Wiki
• Actively learn from senior team members; contribute to process improvement in line with the organization's values of collaboration, expertise, innovation, and responsibility
Essential Technical Skills
Data Engineering & Big Data Technologies
• Experience building data pipelines using Apache Spark (PySpark preferred) and SQL
• Experience with SQL query engines (Hive, Trino/Presto, or similar) and cloud data platforms (AWS S3, EMR, Lambda)
• Understanding of common issues such as data skew and how to mitigate it, experience working with large data volumes, and ability to troubleshoot job failures caused by resource limits, bad data, or scalability challenges
• Real-world experience with debugging and mitigation strategies
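The skew-mitigation strategies called out above usually come down to key salting plus two-stage aggregation: records for a hot key are fanned out across salted sub-keys, aggregated per sub-key, then merged. A minimal pure-Python sketch of the principle follows (in PySpark the same idea is a salt column added before the shuffle; the function names here are illustrative, not from any library):

```python
import random
from collections import Counter

def salted(key: str, num_salts: int = 8) -> str:
    # Each record of a hot key lands in a random salt bucket, spreading
    # shuffle load across up to num_salts partitions instead of one.
    return f"{key}#{random.randrange(num_salts)}"

def skew_resistant_count(records, num_salts: int = 8) -> Counter:
    # Stage 1: aggregate per salted key (the part Spark runs in parallel).
    partial = Counter(salted(k, num_salts) for k in records)
    # Stage 2: strip the salt suffix and merge the partial counts.
    final = Counter()
    for salted_key, n in partial.items():
        final[salted_key.rsplit("#", 1)[0]] += n
    return final

# One hot key ("AAPL") dominates, the classic skew scenario.
records = ["AAPL"] * 1000 + ["MSFT"] * 3
counts = skew_resistant_count(records)
```

The merge in stage 2 is cheap because it sees at most `num_salts` rows per original key, which is why salting trades a small second aggregation for a balanced first one.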
Generative AI & Agentic Systems
• Practical experience building LLM-powered agent systems that use tools and produce structured outputs (not just chatbot interfaces)
• Hands-on experience with at least one agent framework: LangChain, LangGraph, AWS Strands, or equivalent
• Working knowledge of prompt engineering, RAG architectures, and context/memory management
• Experience with foundation model APIs (Anthropic Claude, Amazon Nova, OpenAI, or similar)
• Memory Architecture: Understanding of agent memory tiers (working, episodic, and semantic memory) and strategies for context persistence, pruning, and retrieval across sessions
• Agent Harness Design: Familiarity with harness patterns that wrap LLM reasoning with deterministic guardrails, tool routing, verification loops, and graceful degradation
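The harness pattern described above (LLM reasoning wrapped in deterministic guardrails with verification loops and graceful degradation) can be sketched with a stubbed model call; the stub, the table whitelist, and all names below are hypothetical stand-ins for a real Bedrock or Claude invocation:

```python
import json

def stub_llm(prompt: str) -> str:
    # Stand-in for a real foundation-model call; returns a structured
    # query spec as JSON, which is what a SQL-generation skill would emit.
    return '{"table": "trades", "filters": ["symbol = \'AAPL\'"]}'

ALLOWED_TABLES = {"trades", "orders"}  # hypothetical deterministic whitelist

def run_with_guardrails(question: str, llm=stub_llm, max_retries: int = 2):
    """Draft-verify loop: the LLM proposes a spec, deterministic checks
    validate it, and the harness degrades gracefully on repeated failure."""
    for _ in range(max_retries + 1):
        draft = llm(question)
        try:
            spec = json.loads(draft)          # structural validation
        except json.JSONDecodeError:
            continue                          # re-prompt on malformed output
        if spec.get("table") in ALLOWED_TABLES:  # semantic guardrail
            return {"status": "ok", "spec": spec}
    return {"status": "fallback", "spec": None}  # graceful degradation
```

The key property is that nothing the model emits reaches execution without passing the deterministic checks, and a persistent failure produces a typed fallback rather than an exception.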
AI Tool Proficiency
• Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)
• Experience with spec-driven development - using structured specifications to guide AI code generation, review, and validation
• Ability to leverage AI pair programming for code suggestions, debugging, refactoring, and automated test generation
Cloud Technologies
• Experience with AWS services like S3, EMR, EMR on EKS, Lambda, Bedrock, Step Functions, etc.
• Hands-on experience using S3 with Spark (e.g., dealing with file formats, consistency issues)
• Familiarity with AWS Bedrock for foundation model invocation, knowledge bases, guardrails, and agent orchestration
• Exposure to Google Cloud Vertex AI (model garden, grounding, agent builder) or equivalent managed AI platforms
• Familiarity with AWS monitoring and logging tools (CloudWatch, CloudTrail) for production workloads
Programming – Python
• Proficiency in Python for data engineering and automation
• Ability to write clean, modular, and performant code
• Experience with functional programming concepts (e.g., immutability, higher-order functions)
• Strong understanding of collections, concurrency, and memory management
SQL Skills (Window Functions, Joins, Complex Queries)
• Proficiency with SQL window functions, multi-table joins, and aggregations
• Ability to write and optimize complex SQL queries
• Experience handling edge cases like NULLs, duplicates, and ordering
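The window-function and edge-case skills above can be exercised with the `sqlite3` module from the Python standard library (SQLite 3.25+ supports window functions). A small latest-row-per-key example with hypothetical table and column names, covering duplicates and NULLs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (symbol TEXT, ts INTEGER, price REAL);
INSERT INTO events VALUES
  ('AAPL', 1, 101.0),
  ('AAPL', 2, 102.5),
  ('AAPL', 2, 102.5),   -- exact duplicate row
  ('MSFT', 1, NULL),    -- missing price
  ('MSFT', 3, 310.0);
""")

# Keep the most recent row per symbol. ROW_NUMBER() collapses the
# duplicate, and because SQLite sorts NULL as the smallest value,
# DESC ordering naturally pushes NULL prices to the end.
latest = conn.execute("""
    SELECT symbol, ts, price FROM (
        SELECT symbol, ts, price,
               ROW_NUMBER() OVER (
                   PARTITION BY symbol
                   ORDER BY ts DESC, price DESC
               ) AS rn
        FROM events
    ) WHERE rn = 1
    ORDER BY symbol
""").fetchall()
```

The subquery-plus-`rn = 1` filter is the standard dedupe idiom because window functions cannot appear directly in a `WHERE` clause.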
Good to Have
• AWS Bedrock AgentCore (memory, identity, tool gateway)
• Model Context Protocol (MCP) server development and integration
• Agent evaluation harnesses and agentic patterns (draft-verification, compile-style generation)
• Fine-tuning foundation models for domain-specific tasks (LoRA, PEFT, or managed fine-tuning via Bedrock/Vertex AI)
• Local model execution with Ollama, vLLM, or similar for development and experimentation
• Vector databases (FAISS, Pinecone, OpenSearch)
• Docker, Kubernetes, and Amazon EKS for containerized workloads
• Infrastructure as Code (Terraform, CloudFormation)
• Experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, ArgoCD)
• Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack)
• AWS certifications (AI Practitioner, Solutions Architect, or Kubernetes certifications like CKA/CKAD)






