Validate AI agents against production-level quality criteria and score them on a 0-100 scale.
Use when evaluating agent quality, identifying bugs/gaps, or improving agents to expert level.
Evaluates agents across nine categories: structure, role definition, methodology, user interaction,
quality standards, context management, technical robustness, pedagogical effectiveness (tutor agents only),
and production readiness (operator agents only). Returns an actionable validation report with specific improvements.
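
The description does not say how the per-category results combine into a 0-100 score. The sketch below shows one possible reading, assuming each category is scored 0-100, the two conditional categories count only for their agent type, and all applicable categories are weighted equally. The function name `overall_score`, the `agent_type` labels, and the equal weighting are all hypothetical, not taken from the validator itself.

```python
from statistics import mean

# Category names come from the description; how they are gated and
# weighted is an assumption made for this illustration only.
CORE = [
    "structure", "role definition", "methodology", "user interaction",
    "quality standards", "context management", "technical robustness",
]
CONDITIONAL = {
    "pedagogical effectiveness": "tutor",
    "production readiness": "operator",
}

def overall_score(scores: dict[str, float], agent_type: str) -> float:
    """Average the 0-100 scores of the categories that apply to agent_type."""
    # Core categories always apply; a conditional one applies only to its type.
    applicable = CORE + [c for c, t in CONDITIONAL.items() if t == agent_type]
    missing = [c for c in applicable if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    for category in applicable:
        if not 0 <= scores[category] <= 100:
            raise ValueError(f"score out of range for: {category}")
    return round(mean(scores[c] for c in applicable), 1)

# Usage: a tutor agent that is solid everywhere except pedagogy.
example = {category: 80 for category in CORE}
example["pedagogical effectiveness"] = 40
print(overall_score(example, "tutor"))
```

Under this reading, a category that does not apply to an agent is left out of the average rather than scored as zero, so a general-purpose agent is not penalized for lacking tutor or operator features.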