# practicing-tdd > Use when implementing any feature or bugfix, before writing implementation code - write the test first, watch it fail, write minimal code to pass; ensures tests actually verify behavior by requiring failure first - Author: Samuel Julius Hecht - Repository: samjhecht/wrangler - Version: 20260205143502 - Stars: 0 - Forks: 0 - Last Updated: 2026-02-07 - Source: https://github.com/samjhecht/wrangler - Web: https://mule.run/skillshub/@@samjhecht/wrangler~practicing-tdd:20260205143502 --- --- name: practicing-tdd description: Use when implementing any feature or bugfix, before writing implementation code - write the test first, watch it fail, write minimal code to pass; ensures tests actually verify behavior by requiring failure first --- # Test-Driven Development (TDD) ## Overview Write the test first. Watch it fail. Write minimal code to pass. **Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing. **Violating the letter of the rules is violating the spirit of the rules.** ## When to Use **Always:** - New features - Bug fixes - Refactoring - Behavior changes **Exceptions (ask your human partner):** - Throwaway prototypes - Generated code - Configuration files Thinking "skip TDD just this once"? Stop. That's rationalization. ## The Iron Law ``` NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST ``` Write code before the test? Delete it. Start over. **No exceptions:** - Don't keep it as "reference" - Don't "adapt" it while writing tests - Don't look at it - Delete means delete Implement fresh from tests. Period. ## Red-Green-Refactor ```dot digraph tdd_cycle { rankdir=LR; red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"]; verify_red [label="Verify fails\ncorrectly", shape=diamond]; green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"]; verify_green [label="Verify passes\nAll green", shape=diamond]; refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"]; next [label="Next", shape=ellipse]; red -> verify_red; verify_red -> green [label="yes"]; verify_red -> red [label="wrong\nfailure"]; green -> verify_green; verify_green -> refactor [label="yes"]; verify_green -> green [label="no"]; refactor -> verify_green [label="stay\ngreen"]; verify_green -> next; next -> red; } ``` ### RED - Write Failing Test Write one minimal test showing what should happen. ```typescript test('retries failed operations 3 times', async () => { let attempts = 0; const operation = () => { attempts++; if (attempts < 3) throw new Error('fail'); return 'success'; }; const result = await retryOperation(operation); expect(result).toBe('success'); expect(attempts).toBe(3); }); ``` Clear name, tests real behavior, one thing ```typescript test('retry works', async () => { const mock = jest.fn() .mockRejectedValueOnce(new Error()) .mockRejectedValueOnce(new Error()) .mockResolvedValueOnce('success'); await retryOperation(mock); expect(mock).toHaveBeenCalledTimes(3); }); ``` Vague name, tests mock not code **Requirements:** - One behavior - Clear name - Real code (no mocks unless unavoidable) ### Verify RED - Watch It Fail (MANDATORY EVIDENCE) BEFORE proceeding to GREEN phase: 1. **Execute test command**: ```bash npm test -- path/to/test.test.ts # or pytest path/to/test.py::test_function_name # or cargo test test_function_name ``` 2. **Copy full output showing failure** 3. **Verify failure message matches expected reason**: - ✅ CORRECT: "ReferenceError: retryOperation is not defined" - ✅ CORRECT: "AssertionError: expected 3 to equal undefined" - ❌ WRONG: "TypeError: Cannot read property 'X' of undefined" (syntax error, not missing implementation) - ❌ WRONG: Test passes (you didn't write a failing test!) 4. **If output doesn't match expected failure**: Fix test and re-run **YOU MUST include test output in your message:** #### Example of Required Evidence: ``` Running RED phase verification: $ npm test -- retry.test.ts FAIL tests/retry.test.ts ✕ retries failed operations 3 times (2 ms) ● retries failed operations 3 times ReferenceError: retryOperation is not defined at Object. (tests/retry.test.ts:15:5) Test Suites: 1 failed, 1 total Tests: 1 failed, 1 total Time: 0.234s Exit code: 1 This is the expected failure - function doesn't exist yet. Failure reason matches expectation: "retryOperation is not defined" Proceeding to GREEN phase. ``` **Claims without evidence violate verifying-before-completion.** If you cannot provide this output, you have NOT completed the RED phase. ### GREEN - Minimal Code Write simplest code to pass the test. ```typescript async function retryOperation(fn: () => Promise): Promise { for (let i = 0; i < 3; i++) { try { return await fn(); } catch (e) { if (i === 2) throw e; } } throw new Error('unreachable'); } ``` Just enough to pass ```typescript async function retryOperation( fn: () => Promise, options?: { maxRetries?: number; backoff?: 'linear' | 'exponential'; onRetry?: (attempt: number) => void; } ): Promise { // YAGNI } ``` Over-engineered Don't add features, refactor other code, or "improve" beyond the test. ### Verify GREEN - Watch It Pass (MANDATORY EVIDENCE) AFTER implementing minimal code: 1. **Execute test command** (same as RED): ```bash npm test -- path/to/test.test.ts ``` 2. **Copy full output showing pass** 3. **Verify ALL of these**: - All tests pass (0 failures) - No errors printed - No warnings printed - Exit code: 0 - Test that was failing now passes **YOU MUST include test output in your message:** #### Example of Required Evidence: ``` Running GREEN phase verification: $ npm test -- retry.test.ts PASS tests/retry.test.ts ✓ retries failed operations 3 times (145 ms) Test Suites: 1 passed, 1 total Tests: 1 passed, 1 total Time: 0.189s Exit code: 0 Test now passes. Proceeding to REFACTOR phase. ``` **If any errors/warnings appear**: Fix them before claiming GREEN phase complete. **Claims without evidence violate verifying-before-completion.** If you cannot provide this output, you have NOT completed the GREEN phase. ### REFACTOR - Clean Up After green only: - Remove duplication - Improve names - Extract helpers Keep tests green. Don't add behavior. ### Repeat Next failing test for next feature. ## Good Tests | Quality | Good | Bad | |---------|------|-----| | **Minimal** | One thing. "and" in name? Split it. | `test('validates email and domain and whitespace')` | | **Clear** | Name describes behavior | `test('test1')` | | **Shows intent** | Demonstrates desired API | Obscures what code should do | ## Why Order Matters **"I'll write tests after to verify it works"** Tests written after code pass immediately. Passing immediately proves nothing: - Might test wrong thing - Might test implementation, not behavior - Might miss edge cases you forgot - You never saw it catch the bug Test-first forces you to see the test fail, proving it actually tests something. **"I already manually tested all the edge cases"** Manual testing is ad-hoc. You think you tested everything but: - No record of what you tested - Can't re-run when code changes - Easy to forget cases under pressure - "It worked when I tried it" ≠ comprehensive Automated tests are systematic. They run the same way every time. **"Deleting X hours of work is wasteful"** Sunk cost fallacy. The time is already gone. Your choice now: - Delete and rewrite with TDD (X more hours, high confidence) - Keep it and add tests after (30 min, low confidence, likely bugs) The "waste" is keeping code you can't trust. Working code without real tests is technical debt. **"TDD is dogmatic, being pragmatic means adapting"** TDD IS pragmatic: - Finds bugs before commit (faster than debugging after) - Prevents regressions (tests catch breaks immediately) - Documents behavior (tests show how to use code) - Enables refactoring (change freely, tests catch breaks) "Pragmatic" shortcuts = debugging in production = slower. **"Tests after achieve the same goals - it's spirit not ritual"** No. Tests-after answer "What does this do?" Tests-first answer "What should this do?" Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones. Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't). 30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work. ## Common Rationalizations | Excuse | Reality | |--------|---------| | "Too simple to test" | Simple code breaks. Test takes 30 seconds. | | "I'll test after" | Tests passing immediately prove nothing. | | "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" | | "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. | | "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. | | "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. | | "Need to explore first" | Fine. Throw away exploration, start with TDD. | | "Test hard = design unclear" | Listen to test. Hard to test = hard to use. | | "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. | | "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. | | "Existing code has no tests" | You're improving it. Add tests for existing code. | ### "I watched it fail/pass in my head" **Counter**: Your imagination is not evidence. Run the actual command and paste the output. ### "The test obviously fails/passes, I don't need to show output" **Counter**: Non-obvious bugs exist. Provide output or you didn't verify. ### "I'll just say it failed/passed" **Counter**: That's a claim without evidence. Violation of verifying-before-completion. Show output. ## Red Flags - STOP and Start Over - Code before test - Test after implementation - Test passes immediately - Can't explain why test failed - Tests added "later" - Rationalizing "just this once" - "I already manually tested it" - "Tests after achieve the same purpose" - "It's about spirit not ritual" - "Keep as reference" or "adapt existing code" - "Already spent X hours, deleting is wasteful" - "TDD is dogmatic, I'm being pragmatic" - "This is different because..." **All of these mean: Delete code. Start over with TDD.** ## Example: Bug Fix **Bug:** Empty email accepted **RED** ```typescript test('rejects empty email', async () => { const result = await submitForm({ email: '' }); expect(result.error).toBe('Email required'); }); ``` **Verify RED** ```bash $ npm test FAIL: expected 'Email required', got undefined ``` **GREEN** ```typescript function submitForm(data: FormData) { if (!data.email?.trim()) { return { error: 'Email required' }; } // ... } ``` **Verify GREEN** ```bash $ npm test PASS ``` **REFACTOR** Extract validation for multiple fields if needed. ## TDD Compliance Certification BEFORE claiming work complete, certify TDD compliance: For each new function/method implemented: - [ ] **Function name**: [function_name] - **Test name**: [test_function_name] - **Watched fail**: YES / NO (if NO, explain why) - **Failure reason**: [expected failure message you saw] - **Implemented minimal code**: YES / NO - **Watched pass**: YES / NO - **Refactored**: YES / NO / N/A ### Example Certification: - [x] **Function name**: retryOperation - **Test name**: test_retries_failed_operations_3_times - **Watched fail**: YES - **Failure reason**: "ReferenceError: retryOperation is not defined" - **Implemented minimal code**: YES - **Watched pass**: YES - **Refactored**: YES (extracted delay logic) **Requirements:** - If ANY "NO" answers: Work is NOT complete. Delete and restart with TDD. - This certification MUST be included in your completion message. - One entry required for each new function/method. **Why this matters:** - Required by verifying-before-completion skill - Proves you followed TDD (not just testing after) - Creates audit trail for code review - Makes rationalization harder (explicit lying vs fuzzy thinking) **Cross-reference:** See verifying-before-completion skill for complete requirements. ## Frontend-Specific TDD Frontend testing has special cases: ### Visual Regression Tests **First run generates baseline** (special case): 1. Write functional test (RED-GREEN-REFACTOR) 2. Add visual test (generates baseline) 3. Future changes: Visual test catches regressions (RED if broken) ## Workflow Checklist Copy this checklist to track your progress: See `assets/workflow-checklist.md` for the complete checklist. ## References For detailed information, see: - `references/detailed-guide.md` - Complete workflow details, examples, and troubleshooting