

Gen AI Architect (Eval Framework)
Featured Role | Apply direct with Data Freelance Hub
This role is for a Gen AI Architect (Eval Framework) in Fremont, CA. The contract length and pay rate are not specified. It requires 15 years of experience, expertise in Langfuse, Azure AI services, and LLMOps, and proficiency in Python.
Country: United States
Currency: $ USD
Day rate: Unknown
Date discovered: September 24, 2025
Project duration: Unknown
Location type: On-site
Contract type: Unknown
Security clearance: Unknown
Location detailed: Fremont, CA
Skills detailed: #Debugging #Observability #TypeScript #Scala #Azure #AI (Artificial Intelligence) #Python #Data Science #Cloud #Documentation
Role description
Hi,
I hope you are doing well.
We have an urgent opening for the position below. If you are interested, please share your updated resume along with your rate expectation.
Role: Gen AI Architect (Eval Framework)
Location: Fremont, CA, USA
Experience: 15 years
Job description:
Mandatory skills:
Langfuse (including v3 features)
Azure AI Evaluation SDK
Azure AI services
LLMOps, prompt engineering, and GenAI lifecycle management.
Python
Skills:
· Hands-on experience with Langfuse (including v3 features) and integrations.
· Experience with other GenAI observability tools (e.g., TruLens, W&B, Helicone).
· Knowledge of Retrieval-Augmented Generation (RAG), fine-tuning, and multi-agent orchestration.
· Strong understanding of Azure AI services, especially the Evaluation SDK.
· Deep expertise in LLMOps, prompt engineering, and GenAI lifecycle management.
· Proficiency in Python, TypeScript, or similar languages used in GenAI frameworks.
· Experience with cloud-native architectures (Azure preferred).
· Familiarity with tracing tools, observability platforms, and evaluation metrics.
· Excellent communication and documentation skills.
Key Responsibilities:
· Set up and deploy Langfuse v3 in a production environment.
· Architect and implement the upgrade of Langfuse v2 to v3 within the LamBots framework, ensuring backward compatibility and performance optimization.
· Design modular components for prompt management, tracing, metrics, evaluation, and playground features using Langfuse v3.
· Leverage Langfuse's full feature set:
  ○ Prompt Management: versioning, templating, and optimization
  ○ Tracing: end-to-end visibility into GenAI workflows
  ○ Metrics: performance, latency, and usage analytics
  ○ Evaluation: automated and manual scoring of model outputs
  ○ Playground: interactive testing and debugging of prompts
· Integrate Azure AI Evaluation SDK into LamBots to enable scalable enterprise-grade evaluation pipelines/workflows, including:
· Build reusable components and templates for evaluation across diverse GenAI use cases.
· Collaborate with cross-functional teams to integrate evaluation capabilities into production pipelines and systems.
· Ensure scalability and reliability of evaluation tools in both offline and online environments.
· Define and enforce evaluation standards and best practices for GenAI agents, RAG pipelines, and multi-agent orchestration.
· Collaborate with product, engineering, and data science teams to align evaluation metrics with business KPIs.
· Drive observability, debugging, and traceability features for GenAI workflows.
· Stay current with emerging GenAI evaluation tools, frameworks, and methodologies.
--
Thanks & Regards,
Anil Kumar
Raas Infotek Corporation.
262 Chapman Road, Suite 105A,
Newark, DE 19702
Direct No: 302-286-9932 Ext: 133
Email: anil.kumar@raasinfotek.com