agent-evaluation

Name: agent-evaluation
Brand: MuleRun
Author: Francis Dungca


Feb 7, 2026

Test and evaluate LangGraph agents systematically. Covers dataset creation, custom evaluators, LLM-as-judge patterns with Gemini, and automated benchmarking. Use when building evaluation pipelines, comparing model versions, or measuring agent quality.
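
As a rough illustration of the pipeline described above (dataset, custom evaluators, automated benchmarking), here is a minimal sketch in plain Python. It is not the skill's actual implementation: `Example`, `exact_match`, `contains_expected`, `run_benchmark`, and `toy_agent` are hypothetical names invented for this example, and the agent is a stub standing in for a real LangGraph agent. An LLM-as-judge evaluator would presumably be another function with the same `(output, expected) -> float` signature that asks a model such as Gemini for a score; that call is omitted here to keep the sketch self-contained.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical dataset record: one input question and its reference answer.
@dataclass
class Example:
    question: str
    expected: str

# An evaluator maps (agent output, reference answer) to a score in [0, 1].
Evaluator = Callable[[str, str], float]

def exact_match(output: str, expected: str) -> float:
    """Score 1.0 only when output equals the reference, ignoring case and outer whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def contains_expected(output: str, expected: str) -> float:
    """Score 1.0 when the reference answer appears anywhere in the output."""
    return 1.0 if expected.strip().lower() in output.lower() else 0.0

def run_benchmark(
    agent: Callable[[str], str],
    dataset: list[Example],
    evaluators: dict[str, Evaluator],
) -> dict[str, float]:
    """Run the agent on every example and return the mean score per evaluator."""
    totals = {name: 0.0 for name in evaluators}
    for example in dataset:
        output = agent(example.question)
        for name, evaluate in evaluators.items():
            totals[name] += evaluate(output, example.expected)
    count = len(dataset)
    # An empty dataset yields 0.0 for every evaluator instead of dividing by zero.
    return {name: (total / count if count else 0.0) for name, total in totals.items()}

def toy_agent(question: str) -> str:
    """Stub agent for illustration; a real agent would be called here instead."""
    answers = {
        "What is 2 + 2?": "4",
        "Capital of France?": "The capital is Paris.",
    }
    return answers.get(question, "I don't know")

dataset = [
    Example("What is 2 + 2?", "4"),
    Example("Capital of France?", "Paris"),
]
evaluators = {"exact_match": exact_match, "contains_expected": contains_expected}

scores = run_benchmark(toy_agent, dataset, evaluators)
print(scores)
```

Comparing two model versions then amounts to calling `run_benchmark` once per agent on the same dataset and comparing the returned score dictionaries. Keeping evaluators as plain functions makes it easy to mix cheap string checks with a slower model-based judge.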