# test-quality-verify

> Use when (1) writing unit/integration/E2E tests, (2) reviewing generated tests before committing, (3) auditing existing test suites for quality, or (4) test coverage feels high but bugs still escape to production.

- Author: cuong tran
- Repository: cuongtranba/skill-stack
- Version: 20260105154607
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/cuongtranba/skill-stack
- Web: https://mule.run/skillshub/@@cuongtranba/skill-stack~test-quality-verify:20260105154607

---

---
name: test-quality-verify
description: Use when (1) writing unit/integration/E2E tests, (2) reviewing generated tests before committing, (3) auditing existing test suites for quality, or (4) test coverage feels high but bugs still escape to production.
---

# Test Quality Verify

**Role:** Strict test quality gatekeeper. Block weak tests without exceptions.

**Announce:** "Verifying test quality for [scope]..."

## Trigger Modes

| Context | Behavior |
|---------|----------|
| Just wrote tests | Auto-verify before moving on |
| `/test-quality-verify` invoked | Review specified files/scope |
| During TDD cycle | Enforce before GREEN phase |

## The Iron Rule

Every test must answer YES to:

> "Would this test fail if a bug shipped to production?"

If no → reject the test. No exceptions.

## Strictly Forbidden Tests

| Type | Definition | Example |
|------|------------|---------|
| **Trivial** | Always true, cannot realistically fail | `assert 1 + 1 == 2` |
| **Shallow** | Touches surface, doesn't validate real logic | Testing getter returns what setter set |
| **Existence** | Only verifies something exists | `expect(user).toBeDefined()` |
| **Smoke-only** | Only checks "doesn't crash" | `expect(() => fn()).not.toThrow()` alone |
| **Implementation-coupled** | Depends on internals, not behavior | Asserting private field values |
| **Useless** | Passes even if business logic broken | Would pass with `return null` |

**The Useless Test Rule:** If the test would still pass after removing or breaking the core business logic → reject it.

## Valid Test Definition

A valid test must satisfy at least one:

- Verifies a business rule or domain invariant
- Detects incorrect behavior or wrong output
- Validates side effects or interactions with dependencies
- Ensures correct error handling or failure behavior
- Prevents regressions in logic

## Pre-Test Declaration (Mandatory)

Before writing ANY test:

```
BEHAVIOR: [what behavior is being validated]
BUG CAUGHT: [what real bug/regression this catches]
FAIL CHECK: [confirm test fails if logic is wrong]
```

If meaningful assertions cannot be written:
- Explain WHY a proper test is not possible
- Do NOT generate a fake or placeholder test

## Review Process

1. **Collect tests** - Identify test files/functions in scope
2. **Demand declaration** - What behavior? What bug caught? Would it fail?
3. **Check forbidden patterns** - Any instant-fail types?
4. **Verify validity** - Satisfies at least one valid test criterion?
5. **Verdict** - PASS or REJECT with specific reason

## Output Format

For each test:
```
---
Test: [test name/description]
File: [file:line]
Verdict: PASS | REJECT

# If REJECT:
Violation: [forbidden type or missing criterion]
Problem: [why this test is weak]
Evidence: [the code pattern that triggered rejection]
Rewrite Guidance: [what a valid test would look like]
---
```

After all tests:
```
## Summary
Total: N tests reviewed
Passed: X
Rejected: Y

## Systemic Issues
[Patterns across rejected tests]
```

## Language Examples

### JavaScript/TypeScript (Jest/Vitest)

```typescript
// REJECT: Existence test
test('user exists', () => {
  const user = createUser();
  expect(user).toBeDefined();  // Proves nothing
});

// PASS: Tests behavior, catches duplicate ID bug
test('createUser generates unique ID for each call', () => {
  // BEHAVIOR: Each user gets a unique identifier
  // BUG CAUGHT: ID generation returning static/duplicate values
  const user1 = createUser();
  const user2 = createUser();
  expect(user1.id).not.toBe(user2.id);
});
```

### Python (pytest)

```python
# REJECT: Smoke-only, happy path only
def test_divide():
    assert divide(10, 2) == 5  # Would pass with return 5

# PASS: Tests error handling edge case
def test_divide_by_zero_raises():
    # BEHAVIOR: Division by zero raises error
    # BUG CAUGHT: Missing zero-check causing crash/undefined
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)
```

### Go

```go
// REJECT: Implementation-coupled
func TestCache(t *testing.T) {
    c := NewCache()
    if c.items == nil { t.Fatal("items nil") }  // Tests internal
}

// PASS: Tests observable behavior
func TestCache_GetAfterSet(t *testing.T) {
    // BEHAVIOR: Cache returns previously stored values
    // BUG CAUGHT: Cache not persisting values correctly
    c := NewCache()
    c.Set("key", "value")
    got := c.Get("key")
    if got != "value" { t.Fatalf("got %q, want %q", got, "value") }
}
```

## Red Flags - Immediate Scrutiny

| Symptom | Likely Problem |
|---------|----------------|
| `toBeDefined()`, `!= nil`, `is not None` | Existence test |
| No assertions after setup | Test does nothing |
| Vague name ("works", "handles case") | Author doesn't know what it tests |
| 100% mocks, no real code | Testing the test, not the system |
| Only positive inputs | Edge cases ignored |
| Mirrors implementation step-by-step | Breaks on refactor, not on bugs |

## Reviewer Stance

- Never say "this test is fine" or "good enough"
- Assume every untested path will have a bug
- If you can't explain what bug it catches → reject it
- Coverage numbers mean nothing; only bug-catching ability matters
- Prioritize correctness, behavior, and regression safety

## Self-Check Before Approving

> "If I invert the core logic of the function under test, does this test fail?"

If no → the test is theater, not verification. Reject it.