

Braintrust
Principal Simulation & Reliability Architect (Contract-to-Hire)
This is a Principal Simulation & Reliability Architect (Contract-to-Hire) role focused on architecting simulation systems for multi-step AI workflows. It requires 6+ years of experience building ML, simulation, workflow, or observability systems and strong Python skills; experience in regulated domains is preferred. The commitment is 20 hours/week for 4–6 months, fully remote.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
December 11, 2025
🕒 - Duration
4 to 6 months
🏝️ - Location
Remote
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Monitoring #Datadog #Libraries #ML (Machine Learning) #Data Science #Debugging #AI (Artificial Intelligence) #Grafana #Documentation #Observability #Python #Regression #Deployment
Role description
Job Description
About the Company
Reins AI partners with organizations in high-stakes, regulated domains to evaluate and improve the reliability of complex AI systems. Our simulation platform, Simthetic AI, models multi-step agentic workflows so teams can test, monitor, and improve system behavior before deployment.
This role sits within Reins AI and works primarily on the architecture and development of Simthetic AI’s simulation and reliability systems.
Role Overview
We’re looking for a senior architect to design and prototype the simulation, evaluation, and observability frameworks that support reliable multi-step agentic AI systems. This role combines systems architecture with hands-on engineering: modeling agent workflows, surfacing failure modes, and building internal tools that form the foundation of our platform.
You’ll work closely with the founder, data scientist, and synthetic data team to align simulation capabilities with evaluation, monitoring, and triage workflows.
Responsibilities
• Architect modular simulation environments for multi-step agent workflows
• Model interactions among agents, tools, and document flows
• Prototype components that reveal behavior, edge cases, and failure patterns
• Define evaluation patterns (task success, factuality, adherence to procedure, suitability)
• Build regression, validation, and inspection tooling for simulation output
• Identify key events and metrics for monitoring and triage workflows
• Integrate simulations with modern observability tools (OpenTelemetry, Arize, Grafana)
• Design trace schemas and system health signals
• Establish architectural patterns, frameworks, and documentation for future engineers
• Contribute to the technical roadmap for Simthetic AI's simulation and reliability platform
Required Qualifications
• 6+ years building or architecting complex ML, simulation, workflow, or observability systems
• Strong Python engineering fundamentals; ability to build internal libraries or frameworks
• Experience designing abstractions and end-to-end architectures
• Familiarity with multi-step AI workflows or agentic patterns
• Strong debugging intuition and systems-thinking mindset
• Excellent communication skills; comfort working in a fast-paced, founder-led environment
Preferred Skills
• Experience with simulation frameworks, synthetic data, or agent evaluation systems
• Background in reliability engineering, monitoring, or triage workflows
• Experience in regulated domains (audit, finance, healthcare)
• Knowledge of distributed systems or ML pipeline design
• Familiarity with agentic frameworks (LangGraph, Semantic Kernel, CrewAI)
• Experience with observability platforms (OpenTelemetry, Arize, Grafana, Datadog)
Contract Structure & Duration
• 20 hours/week for 4–6 months
• Contract-to-hire with a clear path to full-time
• Competitive hourly rate (commensurate with seniority and experience)
• Fully remote, flexible hours