# skill-eval

> Evaluate all workspace-hub skills for structural validity, content quality, cross-reference integrity, and registry consistency. Runs 18 checks across critical, warning, and info severity levels with actionable fix suggestions.

- Author: Vamsee Achanta
- Repository: vamseeachanta/workspace-hub
- Version: 20260205082412
- Stars: 1
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/vamseeachanta/workspace-hub
- Web: https://mule.run/skillshub/@@vamseeachanta/workspace-hub~skill-eval:20260205082412

---

---
name: skill-eval
description: Evaluate all workspace-hub skills for structural validity, content quality, cross-reference integrity, and registry consistency. Runs 18 checks across critical, warning, and info severity levels with actionable fix suggestions.
version: 1.0.0
category: development
last_updated: 2026-01-29
capabilities:
  - structural_validation
  - content_quality_analysis
  - cross_reference_integrity
  - report_generation
tools:
  - Bash
  - Read
  - Glob
  - Grep
related_skills:
  - skill-creator
  - compliance-check
  - repository-health-analyzer
  - verification-loop
---

# Skill Eval

## Quick Start

```bash
# Full evaluation of all skills
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py

# JSON output
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --format json

# Single skill
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --skill testing-tdd-london

# Only critical issues
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --severity critical
```

## When to Use

- Auditing skill quality after bulk creation or migration
- Pre-release validation of the skills library
- CI/CD quality gate for skill changes
- Identifying broken cross-references after renaming skills
- Checking compliance with v2 SKILL.md template format

## Core Concepts

### Evaluation Dimensions

The evaluator runs 18 checks organized into three severity levels:

**Critical** (blocks skill usage):
- YAML frontmatter exists and parses
- Required fields present: `name`, `description`

**Warning** (degrades quality):
- Version follows semver, category matches directory
- Required content sections present (Quick Start, When to Use, etc.)
- Code blocks in key sections, no TODO/FIXME markers
- Cross-references in `related_skills` resolve to real skills

**Info** (improvement opportunities):
- Uses v2 template format, has optional sections (Metrics, etc.)
- No duplicate skill names across the library

### Report Output

Reports include:
- Summary with pass/fail counts and percentages
- Issues grouped by severity with counts
- Per-category breakdown
- Top 10 most common issues
- Per-skill details with actionable fix suggestions

## Usage Examples

### Full Evaluation

```bash
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py
```

Output:
```
================================================================
  SKILL EVALUATION REPORT
  Generated: 2026-01-29T14:30:00+00:00
================================================================

SUMMARY
----------------------------------------
  Total skills evaluated:  230
  Passed (no critical):    187  (81.3%)
  Warnings only:           102  (44.3%)
  Critical failures:        43  (18.7%)
```

### JSON for CI/CD

```bash
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py \
  --format json --severity critical \
  --output reports/skill-eval.json
```

### Filter by Category

```bash
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --category development
```

### Summary Only

```bash
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --summary-only
```

## Best Practices

- Run after creating new skills with `/skill-creator` to validate structure
- Use `--format json` in CI pipelines for machine-readable output
- Address critical issues first (missing frontmatter, invalid YAML)
- Use `--severity warning` to focus on actionable improvements
- Run `--category` filters for focused audits of specific skill areas

## Error Handling

| Exit Code | Meaning |
|-----------|---------|
| 0 | All skills pass (no critical issues) |
| 1 | Critical failures found |
| 2 | Script error (missing directory, invalid arguments) |

| Common Issue | Cause | Fix |
|-------------|-------|-----|
| `frontmatter_missing` | Skill uses legacy heading format | Add `---` delimited YAML frontmatter |
| `yaml_invalid` | Syntax error in frontmatter | Fix YAML syntax (check colons, indentation) |
| `related_skill_unresolved` | Referenced skill name doesn't exist | Correct the name or remove the reference |
| `section_missing` | Missing required H2 section | Add the section heading and content |

## Metrics & Success Criteria

| Metric | Target |
|--------|--------|
| Skills passing all critical checks | 100% |
| Skills with complete v2 sections | >80% |
| Resolved cross-references | 100% |
| No TODO/FIXME in skills | 100% |

## Version History

- **1.0.0** (2026-01-29): Initial release with 18 checks, human + JSON output, category/skill filtering