
agent-evaluation-framework

by Mahdi Khan

Feb 7, 2026
Three-layer evaluation for agentic systems: component testing, trajectory evaluation, and outcome evaluation with LLM-as-Judge. Covers golden trajectories, the Four C's, and the evaluation flywheel. Use when building test suites for agents.
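
To make the three layers concrete, here is a minimal sketch of how they might fit together. It is not the framework's actual API: `AgentRun`, `add`, `trajectory_score`, `outcome_score`, and `keyword_judge` are all hypothetical names introduced for illustration, and the judge is a plain keyword check standing in for a real LLM-as-Judge call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AgentRun:
    """Hypothetical record of one agent run: ordered tool calls plus the final answer."""
    tool_calls: List[str]
    final_answer: str


def add(a: float, b: float) -> float:
    """Hypothetical tool; stands in for any single component of the agent."""
    return a + b


def component_test() -> bool:
    # Layer 1 (component testing): check one component in isolation.
    return add(2, 3) == 5


def trajectory_score(run: AgentRun, golden: List[str]) -> float:
    # Layer 2 (trajectory evaluation): fraction of golden steps that
    # appear, in order, within the tool calls the agent actually made.
    if not golden:
        return 1.0
    matched = 0
    for step in run.tool_calls:
        if matched < len(golden) and step == golden[matched]:
            matched += 1
    return matched / len(golden)


def outcome_score(run: AgentRun, question: str,
                  judge: Callable[[str, str], float]) -> float:
    # Layer 3 (outcome evaluation): a judge scores the final answer.
    # A real judge would wrap an LLM call; here it is any callable.
    return judge(question, run.final_answer)


def keyword_judge(question: str, answer: str) -> float:
    # Assumption: a trivial stand-in judge that only looks for the string "5".
    return 1.0 if "5" in answer else 0.0


run = AgentRun(tool_calls=["parse", "add", "format"], final_answer="The sum is 5")
report = {
    "component": component_test(),
    "trajectory": trajectory_score(run, golden=["parse", "add"]),
    "outcome": outcome_score(run, "What is 2 + 3?", keyword_judge),
}
print(report)
```

Keeping the judge as a plain callable means the same harness can run with a deterministic stub in unit tests and with a model-backed judge in full evaluations.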