agent-evaluation-framework
by Mahdi Khan
Feb 7, 2026
Three-layer evaluation for agentic systems — component testing, trajectory evaluation, and outcome evaluation with LLM-as-Judge. Includes golden trajectories, the Four C's, and the evaluation flywheel. Use when building test suites for agents.
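As a rough illustration of how the three layers might fit together, here is a minimal Python sketch. It is not taken from the framework itself: every name in it (`AgentRun`, `within_refund_window`, `trajectory_score`, `outcome_score`, `keyword_judge`) is hypothetical, the golden-trajectory comparison is a simple in-order match chosen for brevity, and the judge is a keyword stub standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AgentRun:
    """Hypothetical record of one agent run: tool calls made, plus the final answer."""
    tool_calls: List[str]
    final_answer: str


# Layer 1: component testing. A single tool is checked in isolation,
# like any ordinary unit test. (Hypothetical tool, for illustration only.)
def within_refund_window(days_since_purchase: int, window_days: int = 30) -> bool:
    return 0 <= days_since_purchase <= window_days


# Layer 2: trajectory evaluation against a golden trajectory.
def trajectory_score(actual: List[str], golden: List[str]) -> float:
    """Fraction of golden steps found, in order, within the actual tool calls.

    Extra steps in `actual` are tolerated; matching stops at the first
    golden step that never appears in the remaining actual calls.
    """
    if not golden:
        return 1.0
    matched = 0
    for step in actual:
        if matched < len(golden) and step == golden[matched]:
            matched += 1
    return matched / len(golden)


# Layer 3: outcome evaluation. The judge is any callable returning a score;
# in practice it would wrap an LLM call (assumption: score in [0, 1]).
def outcome_score(run: AgentRun, task: str, judge: Callable[[str, str], float]) -> float:
    return judge(task, run.final_answer)


def keyword_judge(task: str, answer: str) -> float:
    """Stand-in for an LLM judge: passes if the answer mentions a refund."""
    return 1.0 if "refund" in answer.lower() else 0.0


run = AgentRun(
    tool_calls=["lookup_order", "check_policy", "send_email", "issue_refund"],
    final_answer="Your refund has been issued.",
)
golden = ["lookup_order", "check_policy", "issue_refund"]

assert within_refund_window(12)
print("trajectory:", trajectory_score(run.tool_calls, golden))
print("outcome:", outcome_score(run, "Handle a refund request", keyword_judge))
```

Keeping the judge as a plain callable means a deterministic stub can be used in unit tests and a model-backed judge swapped in later without touching the scoring code.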