# systematic-debugging

> Guides root cause investigation through four phases before fixes. Use when encountering any bug, test failure, or unexpected behavior.

- Author: Terry Huang
- Repository: clthuang/my-ai-setup
- Version: 20260201022554
- Last Updated: 2026-02-06
- Source: https://github.com/clthuang/my-ai-setup
- Web: https://mule.run/skillshub/@@clthuang/my-ai-setup~systematic-debugging:20260201022554

---

---
name: systematic-debugging
description: Guides root cause investigation through four phases before fixes. Use when encountering any bug, test failure, or unexpected behavior.
---

# Systematic Debugging

Random fixes waste time and create new bugs.

## The Iron Law

```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```

If you haven't completed Phase 1, you cannot propose fixes.

## The Four Phases

### Phase 1: Root Cause Investigation

**Before ANY fix:**

1. **Read error messages carefully** - They often contain the solution
2. **Reproduce consistently** - Can you trigger it reliably?
3. **Check recent changes** - What changed that could cause this?
4. **Trace data flow** - Where does the bad value originate?
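
Step 2 can be made measurable. The following is a minimal sketch, assuming a hypothetical helper `measureFailureRate` and a stand-in `flakyOperation` (neither is part of this skill): it runs the suspect code repeatedly and records how often it fails, so you know whether the bug is deterministic before forming any hypothesis.

```javascript
// Hypothetical helper: run a suspect function repeatedly and count failures.
function measureFailureRate(fn, runs) {
  const result = { runs, failures: 0, firstError: null };
  for (let i = 0; i < runs; i++) {
    try {
      fn(i);
    } catch (err) {
      result.failures += 1;
      if (result.firstError === null) result.firstError = err;
    }
  }
  return result;
}

// Stand-in for the code under investigation: fails on every third call.
function flakyOperation(attempt) {
  if (attempt % 3 === 2) {
    throw new Error(`unexpected state on attempt ${attempt}`);
  }
}

const report = measureFailureRate(flakyOperation, 9);
console.error(
  `${report.failures}/${report.runs} runs failed; first error: ${report.firstError?.message}`
);
```

A failure rate of 100% suggests a deterministic trigger; anything lower hints at timing, ordering, or shared state, which is worth knowing before moving to Phase 2.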

### Phase 2: Pattern Analysis

1. **Find working examples** - Locate similar working code in the same codebase
2. **Compare against references** - Read the reference implementation completely
3. **Identify differences** - What's different between working and broken?
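
Step 3 is easier when the comparison is mechanical. As an illustration only, the sketch below assumes a hypothetical `diffPaths` helper and two made-up configuration objects; it lists every path where the working and broken versions disagree.

```javascript
// Hypothetical helper: list every path where two plain objects differ.
function diffPaths(working, broken, prefix = "") {
  const differences = [];
  const keys = new Set([...Object.keys(working), ...Object.keys(broken)]);
  for (const key of keys) {
    const path = prefix ? `${prefix}.${key}` : key;
    const a = working[key];
    const b = broken[key];
    const bothObjects =
      a !== null && b !== null && typeof a === "object" && typeof b === "object";
    if (bothObjects) {
      // Recurse into nested objects so the report names the exact leaf.
      differences.push(...diffPaths(a, b, path));
    } else if (a !== b) {
      differences.push({ path, working: a, broken: b });
    }
  }
  return differences;
}

// Made-up configs for illustration only.
const workingConfig = { retries: 3, cache: { enabled: true, ttlSeconds: 60 } };
const brokenConfig = { retries: 3, cache: { enabled: true }, timeoutMs: 500 };

console.error(diffPaths(workingConfig, brokenConfig));
```

Here the report names `cache.ttlSeconds` (missing in the broken config) and `timeoutMs` (present only in the broken config). Each listed difference is a candidate for the single hypothesis in Phase 3.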

### Phase 3: Hypothesis and Testing

1. **Form single hypothesis** - "I think X is the root cause because Y"
2. **Test minimally** - Make the SMALLEST possible change
3. **Verify before continuing** - Did it work? If not, form a new hypothesis

### Phase 4: Implementation

1. **Create failing test** - Reproduce the bug in a test
2. **Implement single fix** - Address root cause, ONE change
3. **Verify fix** - Test passes? Other tests still pass?
4. **If 3+ fixes failed** - STOP. Question the architecture.

## Red Flags - STOP

- "Quick fix for now"
- "Just try changing X"
- Proposing solutions before investigation
- "One more fix attempt" (after 2+ failures)

**ALL mean: Return to Phase 1.**

## Common Rationalizations

| Excuse | Reality |
|--------|---------|
| "Issue is simple" | Simple issues have root causes too |
| "Emergency, no time" | Systematic is FASTER than thrashing |
| "I see the problem" | Seeing symptoms ≠ understanding root cause |
| "One more try" (after 2+) | 3+ failures = architectural problem |

## 3-Fix Rule

If you've tried 3 fixes without success:
- STOP attempting more fixes
- Question the architecture
- Discuss with user before continuing

Three failed fixes point to an architectural problem, not an isolated bug.

## Reference Materials

**Deep Dive Techniques:**
- [Root Cause Tracing](references/root-cause-tracing.md) - Trace backward through call chains
- [Defense in Depth](references/defense-in-depth.md) - Multi-layer validation
- [Condition-Based Waiting](references/condition-based-waiting.md) - Fix flaky tests
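
The referenced files are not reproduced here. As a rough illustration of the general idea behind condition-based waiting, the sketch below polls for the state a test actually depends on instead of sleeping for a guessed duration. The `waitFor` helper, its option names, and the `job` object are all hypothetical.

```javascript
// Hypothetical helper: poll a condition until it is truthy or a timeout elapses.
function waitFor(condition, { timeoutMs = 1000, intervalMs = 10 } = {}) {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve, reject) => {
    const check = () => {
      let value;
      try {
        value = condition();
      } catch (err) {
        reject(err); // a throwing condition is reported, not retried
        return;
      }
      if (value) {
        resolve(value);
      } else if (Date.now() >= deadline) {
        reject(new Error(`condition not met within ${timeoutMs} ms`));
      } else {
        setTimeout(check, intervalMs);
      }
    };
    check();
  });
}

// Stand-in for asynchronous work that finishes at an unpredictable time.
const job = { done: false };
setTimeout(() => {
  job.done = true;
}, 30);

// Wait for the state itself rather than for an arbitrary delay.
waitFor(() => job.done, { timeoutMs: 500 }).then(() => {
  console.error("job finished");
});
```

The timeout keeps a condition that never becomes true from hanging the test; it fails with a message instead.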

**Scripts:**
- [find-polluter.sh](scripts/find-polluter.sh) - Find which test creates pollution
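
The script itself is not shown in this document, so the following is not its implementation. It is a hedged sketch of one way to bisect for a polluting test, assuming exactly one polluter and a hypothetical `causesPollution(subset)` callback that runs the given tests from a clean state and reports whether the pollution appeared.

```javascript
// Hypothetical bisection: find the single test whose execution causes pollution.
function findPolluter(tests, causesPollution) {
  if (!causesPollution(tests)) return null; // nothing in this list pollutes
  let candidates = tests;
  while (candidates.length > 1) {
    const middle = Math.floor(candidates.length / 2);
    const firstHalf = candidates.slice(0, middle);
    // Keep whichever half still reproduces the pollution.
    candidates = causesPollution(firstHalf) ? firstHalf : candidates.slice(middle);
  }
  return candidates[0];
}

// Made-up test names; the check simulates "run these, then look for the stray file".
const testFiles = [
  "auth.test.js",
  "cart.test.js",
  "export.test.js",
  "search.test.js",
  "user.test.js",
];
const simulatedCheck = (subset) => subset.includes("export.test.js");

console.error(findPolluter(testFiles, simulatedCheck));
```

Bisection needs roughly log2(n) runs rather than n, which pays off when each run is slow. If several tests pollute together, the single-polluter assumption breaks and a linear scan is the safer choice.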

## Quick Reference: Tracing Techniques

| Technique | When to Use |
|-----------|-------------|
| Stack trace logging | Bug deep in execution |
| `console.error()` | Test runner suppresses logger output |
| `new Error().stack` | See complete call chain |
| Bisection script | Unknown test pollution |
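
The `console.error()` and `new Error().stack` rows combine naturally. In the sketch below, `saveRecord` and `importBatch` are hypothetical functions: the guard logs the full call chain at the point where the bad value is noticed, so the trace leads back to where it originated.

```javascript
// Hypothetical guarded function: log who called us when the input looks wrong.
function saveRecord(directory, record) {
  if (!directory) {
    // console.error writes to stderr, which is assumed here to stay visible
    // even when the test runner hides the application logger.
    console.error("DEBUG saveRecord called with empty directory", {
      directory,
      cwd: process.cwd(),
      stack: new Error().stack, // call chain leading to this point
    });
  }
  return `${directory}/${record.id}.json`;
}

function importBatch(records, options) {
  // The bad value originates here: options.outputDir was never set.
  return records.map((record) => saveRecord(options.outputDir, record));
}

importBatch([{ id: 1 }], {});
```

The logged stack includes the `importBatch` frame, which points at the caller that passed the empty value rather than at the function where the symptom surfaced.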

## Quick Reference: Defense Layers

| Layer | Purpose |
|-------|---------|
| Entry validation | Reject invalid at API boundary |
| Business logic | Ensure data makes sense for operation |
| Environment guards | Prevent dangerous operations in test/prod |
| Debug instrumentation | Capture context for forensics |
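
As one possible reading of the table, the sketch below applies all four layers to a single hypothetical operation, `deleteWorkspace`. The function name, the `/tmp/` rule, the `env` option, and the injected `remove` callback are assumptions made for illustration.

```javascript
// Hypothetical operation guarded by the four layers from the table above.
function deleteWorkspace(
  workspacePath,
  { env = process.env.NODE_ENV, remove = () => {} } = {}
) {
  // Layer 1, entry validation: reject invalid input at the boundary.
  if (typeof workspacePath !== "string" || workspacePath.trim() === "") {
    throw new TypeError("workspacePath must be a non-empty string");
  }
  // Layer 2, business logic: the value must make sense for this operation.
  if (workspacePath === "/" || !workspacePath.startsWith("/")) {
    throw new RangeError(`refusing to delete root or relative path: ${workspacePath}`);
  }
  // Layer 3, environment guard: during tests, only touch temporary directories.
  if (env === "test" && !workspacePath.startsWith("/tmp/")) {
    throw new Error(`refusing to delete ${workspacePath} outside /tmp during tests`);
  }
  // Layer 4, debug instrumentation: capture context for later forensics.
  console.error("deleteWorkspace", { workspacePath, env, stack: new Error().stack });
  remove(workspacePath); // injected so the sketch never touches the real disk
  return workspacePath;
}

const removed = [];
deleteWorkspace("/tmp/build-42", { env: "test", remove: (p) => removed.push(p) });
```

Each layer catches a different failure: a caller passing nothing, a caller passing something nonsensical, a test pointed at real data, and the case where all three checks pass but the result is still surprising.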