# testing

> Test Protocol

- Author: Tangi Vass
- Repository: altNightHawk/liza
- Version: 20260129005629
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/altNightHawk/liza
- Web: https://mule.run/skillshub/@@altNightHawk/liza~testing:20260129005629

---

---
name: testing
description: Test Protocol
---

Tests are the immune system — they reject bugs, not document them.

# Test Truth Table

| Code | Test | Interpretation |
|------|------|----------------|
| Working | Green | Good |
| Buggy | Red | Good (bug exposed) |
| Working | Red | Wrong expectations |
| Buggy | Green | **DANGEROUS** |
| Unknown | Red | **STOP — use Analysis Framework below** |

**ANTI-PATTERN:** The most dangerous instinct is to "fix" failing tests by accepting whatever source currently does.

# When Tests Fail

**Ask for context:**
- Is the code in production? Not a guarantee of correctness, yet to consider
- Is the test much more recent than the source code? Could explain wrong expectations

**Default heuristic:** Tests encode intent; assume the test is correct until evidence suggests otherwise. Source drifts; tests usually don't drift without reason.

**Analysis Framework:**
1. Examine both logics independently
2. Consider system design: method contract, error patterns
3. Identify sounder logic given context
4. When uncertain, ask user
5. Type hints may be wrong — verify actual runtime behavior before trusting signatures

# Test Modification

**FORBIDDEN without approval:**
- Changing assertions to match buggy behavior
- Weakening expectations
- Adding exception handlers to mask failures

**Allowed:** Fresh repro tests capturing current failure (existing tests untouched).

# Writing Tests

**Quality Standards:**
- Deterministic, behavioral testing over type checking
- Exception testing with message validation: `pytest.raises(ValueError, match="...")`
- Systematic edge cases: None, boundaries, dates
- Branch coverage: success and failure paths

**Assertion Strength:** Would this assertion pass for a broken implementation?
If yes (tautology, type-only, existence-only, shape-only), strengthen it.

**Mocking Discipline:** Test the code, not the scaffolding.
Heavy mock setup suggests you're testing assumptions about dependencies, not behavior.
Mock external boundaries (APIs, DBs, filesystems), not internal logic.

**Paired Coverage:** For each positive test, consider a negative counterpart.
Happy-path-only is a red flag for complex source.

**Coverage Relevance:** "Tests pass" ≠ "tests exercise changed code".
For non-trivial changes, verify tests cover modified paths.

**Signal Hierarchy:** Integration failures > Unit failures for architectural changes.
Green units with red integration = units work but don't compose.

**Structure:** Prefer GIVEN / WHEN / THEN blocks for readability.
Omit a section when empty; the structure clarifies intent, not ceremony.

**Template Principle:** Identify one high-quality test file as reference. Match its patterns.

**Flaky Tests:** Flakiness is a bug, not noise. When encountered: isolate, report, do not retry-until-green. Flakiness masks real failures.