

JSG (Johnson Service Group, Inc.)
Lead AI QA
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Lead AI QA; the contract length and pay rate are unspecified. It requires 7+ years in Generative AI testing, 5+ years with LLMs, and expertise in Azure AI Foundry, automation frameworks, and responsible AI practices. Remote work, USA only.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 19, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Boston, MA
-
🧠 - Skills detailed
#System Testing #ETL (Extract, Transform, Load) #Scala #Deployment #Compliance #Automation #Azure #Automated Testing #Monitoring #Strategy #Quality Assurance #AI (Artificial Intelligence)
Role description
Our client is seeking a Senior-level AI Quality Assurance Leader specializing in Generative AI, LLM systems, and AI agents, responsible for defining and driving end-to-end quality strategy for scalable and responsible AI deployments. (Remote, USA Only)
Must Have Skills
• Generative AI (GenAI) system testing – 7+ years overall QA experience with hands-on GenAI validation
• Large Language Models (LLMs) – 5+ years experience validating LLM outputs for accuracy, safety, and bias
• Retrieval-Augmented Generation (RAG) systems – 5+ years experience testing pipeline performance and retrieval quality
• Azure AI Foundry – 3+ years experience testing Azure-based AI solutions
• LangGraph – 3+ years experience validating orchestration and multi-agent workflows
• Test Automation Frameworks for AI Systems – 5+ years experience building automated validation and evaluation pipelines
• LangSmith evaluation workflows – 3+ years experience in LLM evaluation and monitoring
• Multi-agent AI architectures – 3+ years experience testing agent coordination and decision logic
• Responsible AI practices – 5+ years experience in bias detection, safety validation, and compliance testing
• Performance and load testing for AI systems – 5+ years experience validating scalability of AI services
• CI/CD integration for AI pipelines – 5+ years experience embedding QA into deployment workflows
• AI evaluation metrics design (BLEU, ROUGE, custom scoring, etc.) – 3+ years experience defining quality benchmarks (a minimal scoring sketch follows this list)
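To make the metrics bullet concrete, here is a minimal, dependency-free Python sketch of the kind of ROUGE-L-style scorer such benchmarks are built on. It is an illustration only: the candidate/reference strings and any pass threshold are hypothetical, and a production pipeline would normally use a vetted library (e.g. rouge-score) with task-specific references.

```python
# Minimal ROUGE-L-style scorer: longest-common-subsequence (LCS) based
# F-measure between a model output and a reference answer.
# Illustrative only; strings and thresholds below are hypothetical.

def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [0] * (len(b) + 1)
    for tok_a in a:
        prev = 0  # dp value for (previous row, previous column)
        for j, tok_b in enumerate(b, start=1):
            cur = dp[j]
            dp[j] = prev + 1 if tok_a == tok_b else max(dp[j], dp[j - 1])
            prev = cur
    return dp[len(b)]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    precision, recall = lcs / len(cand), lcs / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    # Hypothetical LLM output vs. reference answer.
    score = rouge_l_f1(
        "The refund window is 30 days from delivery",
        "Refunds are accepted within 30 days of delivery",
    )
    print(f"ROUGE-L F1: {score:.3f}")  # compare against an agreed benchmark threshold
```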
Responsibilities
• Define and lead QA strategy for Generative AI pipelines, RAG systems, and multi-agent workflows
• Validate LLM outputs for accuracy, safety, bias, and performance across environments
• Oversee quality assurance of Azure AI Foundry-based AI solutions
• Ensure quality across LangGraph orchestration and LangSmith evaluation workflows
• Establish automated testing frameworks and AI-specific evaluation metrics (a pytest-style sketch follows this list)
• Lead QA teams to ensure scalable, reliable, and responsible AI deployments
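The automation and CI/CD responsibilities are easiest to picture as ordinary tests that run on every deployment. Below is a minimal sketch, assuming a pytest-based gate over model outputs; the generate() stub, prompts, required facts, and blocked-term list are placeholders rather than the client's actual stack.

```python
# Sketch: wiring LLM output validation into CI as plain pytest cases,
# so every deployment runs the same accuracy and safety gates.
# generate() and the cases below are placeholders; a real suite would
# call the deployed model endpoint against a curated dataset.

import pytest

def generate(prompt: str) -> str:
    """Stand-in for the model under test (e.g., an Azure-hosted endpoint)."""
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days of delivery.",
    }
    return canned.get(prompt, "I don't know.")

REQUIRED_FACTS = {
    "What is the refund window?": ["30 days"],
}

BLOCKED_TERMS = ["guaranteed returns", "medical diagnosis"]  # hypothetical safety list

@pytest.mark.parametrize("prompt,facts", REQUIRED_FACTS.items())
def test_output_contains_required_facts(prompt, facts):
    answer = generate(prompt).lower()
    missing = [f for f in facts if f.lower() not in answer]
    assert not missing, f"missing facts {missing} for prompt {prompt!r}"

@pytest.mark.parametrize("prompt", REQUIRED_FACTS.keys())
def test_output_avoids_blocked_terms(prompt):
    answer = generate(prompt).lower()
    hits = [t for t in BLOCKED_TERMS if t in answer]
    assert not hits, f"blocked terms {hits} in answer for {prompt!r}"
```

Because the checks are plain pytest cases, they drop into any CI system that can run the test suite, which is the essence of embedding QA into deployment workflows.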