# e2e-investigate

> Investigate e2e test failures, diagnose root causes, and generate actionable tasks for /create-task.

- Author: dearkuya
- Repository: cybervaldez/playbook-react-cli-example
- Version: 20260210015152
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-09
- Source: https://github.com/cybervaldez/playbook-react-cli-example
- Web: https://mule.run/skillshub/@@cybervaldez/playbook-react-cli-example~e2e-investigate:20260210015152

---

---
name: e2e-investigate
description: Investigate e2e test failures, diagnose root causes, and generate actionable tasks for /create-task.
---

## TL;DR

**What:** Root cause analysis for `/e2e` failures. Reads artifacts, diagnoses issues.

**When:** After `/e2e` fails. Required after 3+ consecutive failures.

**Output:** Task description ready for `/create-task` with root cause and suggested fix.

---

## Tech Context Detection

Before investigating, check for technology-specific failure patterns:

1. **Scan failing tests and error logs** for technology mentions
2. **For each tech detected:**
   - Check if `techs/{tech}/README.md` exists — if not, run `/research {tech}` first
   - Check if `references/{tech}.md` exists in this skill's directory
   - If not AND tech's domain affects this skill, produce reference doc:
     - Read `TECH_CONTEXT.md` for the Skill Concern Matrix
     - Evaluate concerns: Failure patterns? Log formats? Reproduction? Debug tools?
     - If 2+ concerns relevant → produce `references/{tech}.md`
3. **Read relevant reference docs** and apply tech-specific investigation patterns

**Domains that affect this skill:** Testing Tools, State Management (common failure modes), Data Fetching

---

# E2E Failure Investigation

Analyze failing `/e2e` test results, investigate root causes, and produce actionable task descriptions ready for `/create-task`.

**Workflow:** `/e2e` (fails) -> `/e2e-investigate` -> `/create-task` (fix)

## When to Trigger

| Scenario | Action |
|----------|--------|
| `/e2e` fails once | Review failure, attempt quick fix, re-run `/e2e` |
| `/e2e` fails twice | Check if same test; consider `/e2e-investigate` |
| `/e2e` fails 3+ times | **Mandatory:** Run `/e2e-investigate` before more retries |
| Flaky test (passes sometimes) | Run `/e2e-investigate` to identify timing issues |
| New failure after code change | Quick fix first; `/e2e-investigate` if unclear |

### Invocation

```bash
# Automatic (reads latest e2e run)
/e2e-investigate

# After specific run
/e2e-investigate tests//e2e-runs/20240115_143022/
```

The skill automatically reads `tests//e2e-runs/latest/` if no path is specified.

## Quick Start

```bash
/e2e-investigate
```

No arguments needed. Automatically reads the latest e2e run artifacts.

## What It Does

1. **Parse Failure**: Read `tests//e2e-runs/latest/report.md` to identify failed phases
2. **Gather Evidence**: Collect screenshots, server logs, error messages
3. **Investigate Root Cause**: Analyze code, check patterns, identify the bug
4. **Diagnose Issue Type**: Categorize (API, UI, state, timing, persistence)
5. **Generate Task**: Output structured task description for `/create-task`

---

## Investigation Workflow

### Step 1: Parse Failure Report

```bash
# Read the latest report
cat tests//e2e-runs/latest/report.md

# Extract failed phases
grep -A5 "FAIL" tests//e2e-runs/latest/report.md
```

Look for:
- Which phase(s) failed
- Error messages in the Errors section
- Duration (short = early failure, long = timeout)
- Suggested fixes already in the report

### Step 2: Gather Evidence (Screenshots as Reference)

> **Note:** Use screenshots as reference when live reproduction is not possible.
> Prefer Step 2.1 (live debugging) when you can interact with the running server.

**Server Logs:**
```bash
# Check for exceptions and stack traces
cat tests//e2e-runs/latest/server.log | grep -E "(Error|Exception|Traceback|404|500)" -A5

# For server restart issues
cat tests//e2e-runs/latest/server_restart.log
```

**Screenshots:**
```bash
# List available screenshots
ls tests//e2e-runs/latest/screenshots/

# View specific screenshot (use Read tool)
# tests//e2e-runs/latest/screenshots/06-server-restart.png
```

**What to look for in screenshots:**
| Screenshot | Check For |
|------------|-----------|
| 01-startup-clean.png | Main UI visible, no error modals |
| 02-navigation.png | Content loaded, parameters set |
| 03-generation.png | Generated content visible in grid |
| 04-post-generation.png | Counts updated, state reflects changes |
| 05-persistence.png | Same state after refresh |
| 06-server-restart.png | Data still visible after restart |

### Step 2.1: Reproduce Issue Live (Preferred)

Before analyzing screenshots, try to reproduce the failure interactively:

```bash
# Start server
npm run dev

# Reproduce the failing scenario with curl
curl -sf "{{BASE_URL}}/api/endpoint" | jq '.'

# Or reproduce with agent-browser snapshot (not screenshot)
agent-browser open "{{BASE_URL}}/"
agent-browser snapshot -c | grep "expected-element"
```

**Why live debugging first:**
- Active reproduction reveals more than static screenshot analysis
- You can try variations and narrow down the root cause
- Snapshot + curl are low-cost verification methods (~500 tokens vs ~2000+ for screenshots)

If you cannot reproduce the issue live, then analyze the screenshot artifacts from `/e2e`.

### Step 3: Investigate Root Cause

Search codebase for related code:

```bash
# For API issues
grep -r "relevant_endpoint" src/server/api/

# For UI issues
grep -r "component_name" src/js/

# For state issues
grep -r "fallback\|default\||| " src/js/

# For persistence issues
grep -r "scan\|index\|startup\|outputs" src/server/
```

**Check for anti-patterns:**
- Silent defaults: `x || 42`, `x ?? defaultValue`
- Missing error handling: `catch(e) { }`
- Hardcoded values that should be dynamic
- Race conditions in async code

### Step 4: Categorize Issue Type

| Category | Symptoms | Investigation Focus |
|----------|----------|---------------------|
| **API/Data** | 500 errors, missing fields, wrong data | `src/server/api/*.py`, server.log |
| **UI Element** | Element not found, wrong visibility | `src/js/*.js`, data-testid attrs |
| **State** | Wrong/stale state, 0/0 counts | State initialization, fallback logic |
| **Timing** | Flaky, timeout, intermittent | Sleep durations, async handling |
| **Persistence** | Data lost on restart/refresh | Server startup scan, file I/O |

**Common Root Causes by Category:**

**API/Data:**
- Endpoint returns wrong data structure
- Missing error propagation
- Query parameters not parsed

**UI Element:**
- Missing data-testid
- Element hidden/conditional
- Wrong selector in test

**State:**
- Fallback defaults hiding real errors
- State not initialized before access
- URL params not read correctly

**Timing:**
- Insufficient wait times
- Race between API and render
- SSE events not awaited

**Persistence:**
- Server doesn't scan outputs on startup
- Files written to wrong location
- State not preserved in URL

### Step 5: Generate Task Description

Output this format for `/create-task`:

```markdown
## Fix: [Specific Issue Title]

### Root Cause
[What's broken and why - with file:line references]

### Expected Behavior
[What should happen when working correctly]

### Suggested Fix
[Code location and approach]

### Verification
[How /e2e will verify the fix works]
```

---

## E2E Artifacts Reference

After `/e2e` runs, these artifacts exist:

```
tests//e2e-runs/latest/          <- Symlink to most recent run
├── report.md                   <- Structured failure report
├── server.log                  <- Server output with exceptions
├── server_restart.log          <- Phase 7 restart log (if applicable)
└── screenshots/
    ├── 01-startup-clean.png
    ├── 02-navigation.png
    ├── 03-generation.png
    ├── 04-post-generation.png
    ├── 05-persistence.png
    └── 06-server-restart.png
```

**Report.md Structure:**
- Summary (phases, passed/failed, duration)
- Phase Results (status, duration, errors per phase)
- Screenshots list
- Failures section with suggested fixes

---

## Evidence Checklist

For each failure type, check these:

### API Failures
- [ ] server.log for 4xx/5xx errors
- [ ] Stack traces in server.log
- [ ] API endpoint code in `src/server/api/`
- [ ] Request/response format

### UI Failures
- [ ] Screenshot for visual state
- [ ] JS errors in server.log (browser console)
- [ ] Component code in `src/js/`
- [ ] data-testid attributes present

### State Failures
- [ ] Screenshot shows wrong values
- [ ] Check for fallback defaults in JS
- [ ] URL parameter handling
- [ ] State initialization order

### Timing Failures
- [ ] Check test sleep durations
- [ ] Look for race conditions
- [ ] SSE event handling
- [ ] Async/await usage

### Persistence Failures
- [ ] server_restart.log for errors
- [ ] Screenshot shows state lost
- [ ] Server startup scan logic
- [ ] File existence in outputs/

---

## Example Investigation

**Scenario:** Phase 7 (Server Restart) fails with "Data not visible after server restart"

### Step 1: Parse Failure

```
Phase 7: Server Restart - FAIL
- Duration: 6s
- Errors: Data not visible after server restart
```

### Step 2: Gather Evidence

```bash
# Screenshot shows "ITEMS (0)" in sidebar
# Read: tests//e2e-runs/latest/screenshots/06-server-restart.png

# Server restart log shows clean startup
cat tests//e2e-runs/latest/server_restart.log
# No exceptions, server started on port 5173

# Check if outputs exist
ls public/outputs/
# Files are present on disk
```

**Finding:** Files exist on disk but UI shows 0 count.

### Step 3: Investigate

```bash
# How does server discover outputs on startup?
grep -r "scan\|index" src/server/ --include="*.py"

# Found: scan_repair.py handles output discovery
# Check if it scans correctly
grep -r "scan" src/server/scan_repair.py
```

**Finding:** Server scans outputs but may not index them correctly.

### Step 4: Categorize

- **Type**: Persistence issue
- **Focus**: Server startup scan logic
- **Pattern**: State initialization

### Step 5: Generate Task

```markdown
## Fix: Generated data not persisted across server restart

### Root Cause
The server's startup scan in `src/server/scan_repair.py` does not
index existing files correctly. Files are generated and saved
to disk, but when the server restarts, they are not re-discovered.

The scan logic may be filtering incorrectly or not reading the
correct status files.

### Expected Behavior
After server restart, previously generated items should:
1. Appear in the sidebar with correct counts
2. Be visible in the main grid
3. Show "ITEMS (N)" where N > 0

### Suggested Fix
1. Check `src/server/scan_repair.py` startup logic
2. Ensure all items are indexed in the scan
3. Verify status files are read correctly from outputs/

### Verification
Run `/e2e` - Phase 7 (Server Restart) should pass with data visible.
```

---

## Failure Patterns Reference

### Pattern: "0/0" Counts
- **Symptom**: UI shows 0/0 even with data
- **Cause**: Fallback defaults, scan not running
- **Check**: UI state code, API endpoints

### Pattern: "Data not visible after restart"
- **Symptom**: Count drops to 0 after server restart
- **Cause**: Server doesn't re-scan outputs on startup
- **Check**: Server startup sequence, scan logic

### Pattern: "Element not found"
- **Symptom**: Test can't find UI element
- **Cause**: Missing data-testid, element not rendered
- **Check**: Component code, conditional rendering

### Pattern: "State lost after refresh"
- **Symptom**: Phase 6 fails, state resets
- **Cause**: URL params not preserved, localStorage issue
- **Check**: URL handling, state persistence code

### Pattern: "Generation timeout"
- **Symptom**: Phase 4 fails after long duration
- **Cause**: Generation hung, wrong parameters
- **Check**: server.log for generation progress, API calls

---

## Integration with /create-task

The output from `/e2e-investigate` is designed to feed directly into `/create-task`:

1. Run `/e2e` -> get failure
2. Run `/e2e-investigate` -> get task description
3. Copy task description to `/create-task <description>`
4. Implement fix
5. Run `/e2e` -> verify fix

---

## Starting Servers for Debugging

**IMPORTANT:** Always use the startup scripts, not raw python commands.

```bash
# Start server (default port 5173)
npm run dev
```

**Never use raw python commands** - the startup scripts handle venv activation, default flags, and port configuration.

## Limitations

- **Read-only** - Investigates failures but doesn't fix them; outputs to `/create-task`
- **Pipeline position** - Triggered by `/e2e` failures; feeds into `/create-task` for fixes
- **Prerequisites** - Requires `/e2e` to have run and failed; needs failure artifacts in `tests//e2e-runs/`
- **Not suitable for** - Successful test runs; test authoring (use `/e2e-guard` for that)
- **Artifact dependency** - Screenshots and logs must exist; if artifacts are missing, re-run `/e2e` first

| Limitation | Next Step |
|------------|-----------|
| No artifacts | Run `/e2e` first to generate failure artifacts |

## See Also

- `/e2e` - Run the full e2e test suite
- `/create-task` - Implement the fix based on investigation
- `/coding-guard` - Audit code changes
- `/e2e-guard` - Verify test coverage