# testing > Test Protocol - Author: Tangi Vass - Repository: altNightHawk/liza - Version: 20260129005629 - Stars: 0 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/altNightHawk/liza - Web: https://mule.run/skillshub/@@altNightHawk/liza~testing:20260129005629 --- --- name: testing description: Test Protocol --- Tests are the immune system — they reject bugs, not document them. # Test Truth Table | Code | Test | Interpretation | |------|------|----------------| | Working | Green | Good | | Buggy | Red | Good (bug exposed) | | Working | Red | Wrong expectations | | Buggy | Green | **DANGEROUS** | | Unknown | Red | **STOP — use Analysis Framework below** | **ANTI-PATTERN:** The most dangerous instinct is to "fix" failing tests by accepting whatever source currently does. # When Tests Fail **Ask for context:** - Is the code in production? Not a guarantee of correctness, yet to consider - Is the test much more recent than the source code? Could explain wrong expectations **Default heuristic:** Tests encode intent; assume the test is correct until evidence suggests otherwise. Source drifts; tests usually don't drift without reason. **Analysis Framework:** 1. Examine both logics independently 2. Consider system design: method contract, error patterns 3. Identify sounder logic given context 4. When uncertain, ask user 5. Type hints may be wrong — verify actual runtime behavior before trusting signatures # Test Modification **FORBIDDEN without approval:** - Changing assertions to match buggy behavior - Weakening expectations - Adding exception handlers to mask failures **Allowed:** Fresh repro tests capturing current failure (existing tests untouched). # Writing Tests **Quality Standards:** - Deterministic, behavioral testing over type checking - Exception testing with message validation: `pytest.raises(ValueError, match="...")` - Systematic edge cases: None, boundaries, dates - Branch coverage: success and failure paths **Assertion Strength:** Would this assertion pass for a broken implementation? If yes (tautology, type-only, existence-only, shape-only), strengthen it. **Mocking Discipline:** Test the code, not the scaffolding. Heavy mock setup suggests you're testing assumptions about dependencies, not behavior. Mock external boundaries (APIs, DBs, filesystems), not internal logic. **Paired Coverage:** For each positive test, consider a negative counterpart. Happy-path-only is a red flag for complex source. **Coverage Relevance:** "Tests pass" ≠ "tests exercise changed code". For non-trivial changes, verify tests cover modified paths. **Signal Hierarchy:** Integration failures > Unit failures for architectural changes. Green units with red integration = units work but don't compose. **Structure:** Prefer GIVEN / WHEN / THEN blocks for readability. Omit a section when empty; the structure clarifies intent, not ceremony. **Template Principle:** Identify one high-quality test file as reference. Match its patterns. **Flaky Tests:** Flakiness is a bug, not noise. When encountered: isolate, report, do not retry-until-green. Flakiness masks real failures.