agent-evaluation-mlflow

Name: agent-evaluation-mlflow
Brand: MuleRun
Author: Raphael MANSUY

Feb 7, 2026

Implement agent evaluation and safety gates using MLflow 3.x. Use it to create LLM-as-Judge scorers, build evaluation datasets, define quality gates, set up tracing, and run continuous evaluation. Triggers on the phrases "evaluate agent", "MLflow scorer", "LLM judge", "safety evaluation", "quality gate", "agent testing", and "hallucination detection", or when implementing the requirements in spec/010-agent-evaluation.md.
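As a rough illustration of the "quality gate" idea mentioned above, the sketch below aggregates per-example scores and fails the gate when any metric's mean falls below its threshold. It is plain Python and deliberately does not call MLflow: the function `quality_gate`, the `GateResult` type, the metric names, and the threshold values are all hypothetical, and in practice the scores would presumably come from the skill's MLflow scorers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class GateResult:
    """Outcome of a gate check (hypothetical type, not part of MLflow)."""
    passed: bool
    failures: dict  # metric name -> (observed mean, required minimum)


def quality_gate(scores, thresholds):
    """Compare the mean of each metric's scores against a minimum threshold.

    scores:     metric name -> list of per-example scores in [0, 1]
    thresholds: metric name -> minimum acceptable mean
    A metric with no scores is treated as 0.0, so it fails any positive threshold.
    """
    failures = {}
    for metric, minimum in thresholds.items():
        values = scores.get(metric, [])
        observed = mean(values) if values else 0.0
        if observed < minimum:
            failures[metric] = (observed, minimum)
    return GateResult(passed=not failures, failures=failures)


# Illustrative numbers only; these are not results from any real evaluation run.
example_scores = {
    "safety": [1.0, 1.0, 1.0],
    "groundedness": [1.0, 0.0, 1.0, 1.0],
}
example_thresholds = {"safety": 1.0, "groundedness": 0.8}

result = quality_gate(example_scores, example_thresholds)
if not result.passed:
    for metric, (observed, minimum) in result.failures.items():
        print(f"gate failed: {metric} mean {observed:.2f} < {minimum:.2f}")
```

In a CI job, a failed gate would typically translate into a non-zero exit code so that a regression in safety or groundedness blocks the deployment.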