JSG (Johnson Service Group, Inc.)

Lead AI QA

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Lead AI QA; the contract length and pay rate are unspecified. It requires 7+ years in Generative AI testing, 5+ years with LLMs, and expertise in Azure AI Foundry, automation frameworks, and responsible AI practices. The role is remote, USA only.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
February 19, 2026
🕒 - Duration
Unknown
🏝️ - Location
Remote
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Boston, MA
🧠 - Skills detailed
#System Testing #ETL (Extract, Transform, Load) #Scala #Deployment #Compliance #Automation #Azure #Automated Testing #Monitoring #Strategy #Quality Assurance #AI (Artificial Intelligence)
Role description
Our client is seeking a Senior-level AI Quality Assurance Leader specializing in Generative AI, LLM systems, and AI agents, responsible for defining and driving end-to-end quality strategy for scalable and responsible AI deployments. (Remote, USA Only)

Must Have Skills
• Generative AI (GenAI) system testing – 7+ years overall QA experience with hands-on GenAI validation
• Large Language Models (LLMs) – 5+ years experience validating LLM outputs for accuracy, safety, and bias
• Retrieval-Augmented Generation (RAG) systems – 5+ years experience testing pipeline performance and retrieval quality
• Azure AI Foundry – 3+ years experience testing Azure-based AI solutions
• LangGraph – 3+ years experience validating orchestration and multi-agent workflows
• Test Automation Frameworks for AI Systems – 5+ years experience building automated validation and evaluation pipelines
• LangSmith evaluation workflows – 3+ years experience in LLM evaluation and monitoring
• Multi-agent AI architectures – 3+ years experience testing agent coordination and decision logic
• Responsible AI practices – 5+ years experience in bias detection, safety validation, and compliance testing
• Performance and load testing for AI systems – 5+ years experience validating scalability of AI services
• CI/CD integration for AI pipelines – 5+ years experience embedding QA into deployment workflows
• AI evaluation metrics design (BLEU, ROUGE, custom scoring, etc.) – 3+ years experience defining quality benchmarks

Responsibilities
• Define and lead QA strategy for Generative AI pipelines, RAG systems, and multi-agent workflows
• Validate LLM outputs for accuracy, safety, bias, and performance across environments
• Oversee quality assurance of Azure AI Foundry-based AI solutions
• Ensure quality across LangGraph orchestration and LangSmith evaluation workflows
• Establish automated testing frameworks and AI-specific evaluation metrics
• Lead QA teams to ensure scalable, reliable, and responsible AI deployments
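For candidates gauging what the automated evaluation items above involve, the sketch below is a minimal, dependency-free illustration of one piece: a ROUGE-1-style unigram-overlap score plus a naive blocklist safety check used as a quality gate on a single LLM output. The metric choice, threshold, blocklist, and sample strings are assumptions made for illustration, not the client's actual framework.

```python
# Illustrative sketch only: a minimal example of automated LLM output evaluation.
# The threshold, blocklist, and test strings below are hypothetical.
from collections import Counter


def _tokens(text: str) -> list[str]:
    """Lowercase, whitespace-split tokens with surrounding punctuation stripped."""
    return [t.strip(".,;:!?") for t in text.lower().split() if t.strip(".,;:!?")]


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (ROUGE-1 style) between a model output and a reference answer."""
    cand, ref = _tokens(candidate), _tokens(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((Counter(cand) & Counter(ref)).values())
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)


def contains_blocked_terms(candidate: str, blocked: set[str]) -> bool:
    """Naive safety gate: flag outputs containing any term from a blocklist."""
    return bool(set(_tokens(candidate)) & blocked)


if __name__ == "__main__":
    # Hypothetical regression case: golden reference, model output, and a blocklist.
    reference = "The refund policy allows returns within 30 days of purchase."
    model_output = "Returns are allowed within 30 days of purchase under the refund policy."
    blocked_terms = {"guarantee", "always"}

    score = rouge1_f1(model_output, reference)
    unsafe = contains_blocked_terms(model_output, blocked_terms)
    print(f"ROUGE-1 F1: {score:.2f}  unsafe: {unsafe}")
    # Quality gate: fail the pipeline if overlap is too low or a blocked term appears.
    assert score >= 0.5 and not unsafe, "LLM output failed quality gate"
```

In practice a check like this would run per test case inside a CI/CD stage, and a production setup would typically replace the hand-rolled metric with a library or LLM-as-judge evaluator, such as the LangSmith evaluation workflows named in the requirements.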