

Teamware Solutions
LLM Evaluation Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for an LLM Evaluation Engineer working remotely for 12+ months at competitive pay. Key skills include LLMs, AI evaluation methodologies, Python, and hands-on experience with evaluation tools. A strong understanding of AI safety and bias testing is essential.
🌎 - Country
United States
💱 - Currency
€ EUR
-
💰 - Day rate
Unknown
-
🗓️ - Date
December 17, 2025
🕒 - Duration
More than 6 months
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Data Analysis #AI (Artificial Intelligence) #API (Application Programming Interface) #Programming #Automation #Security #Python #Datasets #Batch
Role description
LLM Evaluation Engineer
Location: Remote
Duration: 12+ Months
Required Skills
• Strong understanding of LLMs and generative AI concepts, including model behavior and output evaluation
• Experience with AI evaluation and benchmarking methodologies, including baseline creation and model comparison
• Hands-on expertise in Eval testing, creating structured test suites to measure accuracy, relevance, safety, and performance
• Ability to define and apply evaluation metrics (precision/recall, BLEU/ROUGE, F1, hallucination rate, latency, cost per output); see the sketch after this list
• Prompt engineering and prompt testing experience across zero-shot, few-shot, and system prompt scenarios
• Python and other programming languages for automation, data analysis, batch evaluation execution, and API integration
• Experience with evaluation tools/frameworks (OpenAI Evals, HuggingFace evals, Promptfoo, Ragas, DeepEval, LM Eval Harness)
• Ability to create datasets, test cases, benchmarks, and ground truth references for consistent scoring
• Test design and test automation experience, including reproducible evaluation pipelines
• Knowledge of AI safety, bias, security testing, and hallucination analysis
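As a rough illustration of the metrics and batch-evaluation work described above, the Python sketch below scores a small test set against ground-truth references and reports exact-match accuracy plus a crude hallucination-rate proxy. This is a minimal sketch, not part of the posting or any specific framework; `call_model` and `test_cases.json` are hypothetical placeholders for the model endpoint and dataset under evaluation.

```python
# Illustrative batch-evaluation sketch; call_model and test_cases.json are hypothetical.
import json
from statistics import mean


def call_model(prompt: str) -> str:
    """Placeholder for the LLM API call under test (e.g., an HTTP client wrapper)."""
    raise NotImplementedError("wire this up to the model endpoint being evaluated")


def exact_match(prediction: str, reference: str) -> bool:
    """Strict accuracy check: normalized prediction equals the ground-truth reference."""
    return prediction.strip().lower() == reference.strip().lower()


def possible_hallucination(prediction: str, supporting_facts: list[str]) -> bool:
    """Crude proxy: flag outputs that mention none of the expected reference facts."""
    text = prediction.lower()
    return not any(fact.lower() in text for fact in supporting_facts)


def run_eval(test_cases: list[dict]) -> dict:
    """Each test case is {"prompt": str, "reference": str, "facts": [str, ...]}."""
    rows = []
    for case in test_cases:
        output = call_model(case["prompt"])
        rows.append({
            "prompt": case["prompt"],
            "output": output,
            "exact_match": exact_match(output, case["reference"]),
            "hallucination": possible_hallucination(
                output, case.get("facts", [case["reference"]])
            ),
        })
    return {
        "accuracy": mean(r["exact_match"] for r in rows),
        "hallucination_rate": mean(r["hallucination"] for r in rows),
        "details": rows,  # keep per-case rows so failures are inspectable and reruns reproducible
    }


if __name__ == "__main__":
    with open("test_cases.json") as f:
        summary = run_eval(json.load(f))
    print(f"accuracy={summary['accuracy']:.3f} "
          f"hallucination_rate={summary['hallucination_rate']:.3f}")
```

Frameworks named above (for example Promptfoo, DeepEval, or LM Eval Harness) generally package this same dataset-in, scores-out pattern with richer metrics, assertions, and reporting.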