# kv-eviction-labbook-loop

> Audit-ready iteration workflow for KV-cache eviction experiments: validate run artifacts (summary.json/logs.jsonl), classify result_type/claim_scope, enforce causal-claim gates, and automatically write/update labbook logs (MASTER_LOG.md, daily/YYYY-MM-DD.md, INDEX.md) with fixed schemas. Use when running or tuning KV eviction/compression experiments and needing reproducible, reviewer-grade logging + next-step patch plan.

- Author: Ay1men2
- Repository: Ay1men2/KV-cache-eviction
- Version: 20260207170258
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/Ay1men2/KV-cache-eviction
- Web: https://mule.run/skillshub/@@Ay1men2/KV-cache-eviction~kv-eviction-labbook-loop:20260207170258

---

---
name: kv-eviction-labbook-loop
description: "Audit-ready iteration workflow for KV-cache eviction experiments: validate run artifacts (summary.json/logs.jsonl), classify result_type/claim_scope, enforce causal-claim gates, and automatically write/update labbook logs (MASTER_LOG.md, daily/YYYY-MM-DD.md, INDEX.md) with fixed schemas. Use when running or tuning KV eviction/compression experiments and needing reproducible, reviewer-grade logging + next-step patch plan."
---

# KV Eviction Labbook Loop (One-Skill Workflow)

## Inputs required from user
You must obtain at least ONE of:
- a run directory path: `artifacts/runs/<run_id>/`
- the full contents of a `summary.json`

Optional but recommended:
- `logs.jsonl` snippets (the first 3 lines, one line with `eviction_decision_step=true`, and the last 3 lines)
- key fields from `marker_diagnostics.json`
- a change summary (git commit message or bulleted edits)

If both run_dir and summary are provided, prefer run_dir.
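
The recommended `logs.jsonl` snippet (first 3, one eviction decision, last 3) could be extracted roughly as follows. This is a minimal sketch, not the repository's code: the helper name `select_log_snippets` is hypothetical, and it assumes `eviction_decision_step` is a top-level boolean in each JSONL record.

```python
import json


def select_log_snippets(lines, edge=3):
    """Return the first `edge` records, one eviction-decision record, and the last `edge`.

    Assumption: each non-empty line is one JSON object, and the flag
    `eviction_decision_step` sits at the top level of the record.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    n = len(records)
    # Indices of the head and tail windows (they may overlap for short logs).
    keep = set(range(min(edge, n))) | set(range(max(n - edge, 0), n))
    # Add the first record flagged as an eviction decision, if there is one.
    for i, rec in enumerate(records):
        if rec.get("eviction_decision_step") is True:
            keep.add(i)
            break
    return [records[i] for i in sorted(keep)]
```

If the first flagged record already falls inside the head or tail window, no extra record is added, so the snippet can be shorter than seven entries.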

## Output contract (always produce in this order)
1) Gatekeeping verdict (result_type, claim_scope, paper-claim eligibility, single primary blocker)
2) Audit checklist (PASS/FAIL/UNKNOWN for required assertions)
3) Findings (facts only, numeric)
4) Mechanistic diagnosis (hypotheses labeled + falsification tests)
5) Minimal patch plan (1-3 actions, file/function + acceptance criteria)
6) Log updates written:
   - append entry in `docs/labbook/MASTER_LOG.md`
   - update `docs/labbook/INDEX.md`
   - create/append `docs/labbook/daily/YYYY-MM-DD.md`
7) Next experiment spec (config delta YAML keys)

## Hard gates (do not violate)
- result_type in {no_eviction, simulated, mixed, physical}
- claim_scope mapping:
  - physical -> eligible_for_causal_claim
  - mixed -> partial_causal_claim_with_caveats
  - simulated/no_eviction -> instrumentation_only_no_causal_claim
- Never write global causal claims unless `result_type=physical` and the shape evidence is closed (per Step C, every physical eviction step has a passing shape check).
- Prompts must remain benign. Do not introduce harmful instruction content.
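
The gates above can be expressed as a small lookup. This is an illustrative sketch only: the function names are hypothetical, while the result types and claim scopes are copied from the list above.

```python
# Mapping taken directly from the hard gates; names of the helpers are hypothetical.
CLAIM_SCOPE = {
    "physical": "eligible_for_causal_claim",
    "mixed": "partial_causal_claim_with_caveats",
    "simulated": "instrumentation_only_no_causal_claim",
    "no_eviction": "instrumentation_only_no_causal_claim",
}


def claim_scope_for(result_type):
    """Map a result_type to its claim_scope, rejecting anything outside the allowed set."""
    if result_type not in CLAIM_SCOPE:
        raise ValueError(f"result_type must be one of {sorted(CLAIM_SCOPE)}, got {result_type!r}")
    return CLAIM_SCOPE[result_type]


def may_write_global_causal_claim(result_type, shape_evidence_closed):
    """Global causal claims need BOTH a physical run and closed shape evidence."""
    return result_type == "physical" and shape_evidence_closed is True
```

Raising on an unrecognized `result_type` keeps a typo from silently falling through to a permissive scope.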

## Workflow (low freedom; follow exactly)
### Step A - Load artifacts
1) If a run_dir was provided:
   - read `summary.json`
   - if present, read `logs.jsonl` and `marker_diagnostics.json`
2) If only summary text was provided, operate on it and mark missing fields as UNKNOWN.
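
A minimal sketch of Step A, assuming the three file names given above and nothing else about the repository layout. The function names `load_artifacts` and `field` are hypothetical.

```python
import json
from pathlib import Path


def load_artifacts(run_dir=None, summary_text=None):
    """Load run artifacts, preferring run_dir when both inputs are given."""
    if run_dir is not None:
        root = Path(run_dir)
        summary = json.loads((root / "summary.json").read_text(encoding="utf-8"))
        logs, marker = None, None
        logs_path = root / "logs.jsonl"
        if logs_path.is_file():
            # One JSON object per non-empty line.
            text = logs_path.read_text(encoding="utf-8")
            logs = [json.loads(line) for line in text.splitlines() if line.strip()]
        marker_path = root / "marker_diagnostics.json"
        if marker_path.is_file():
            marker = json.loads(marker_path.read_text(encoding="utf-8"))
        return {"summary": summary, "logs": logs, "marker_diagnostics": marker}
    if summary_text is not None:
        # Summary-only mode: the optional artifacts are simply absent.
        return {"summary": json.loads(summary_text), "logs": None, "marker_diagnostics": None}
    raise ValueError("need either run_dir or summary_text")


def field(summary, key):
    """Read a summary field, reporting UNKNOWN instead of inventing a value."""
    return summary.get(key, "UNKNOWN")
```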

### Step B - Validate schema + compute assertions
Run `scripts/validate_run_artifacts.py` to produce:
- normalized fields
- PASS/FAIL/UNKNOWN assertions
- primary blocker string
If scripts are unavailable, reproduce the same checks manually and state UNKNOWN for missing fields.
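
When the checks have to be reproduced by hand, a tri-state helper keeps missing fields from being reported as failures. The sketch below is not the contents of `scripts/validate_run_artifacts.py`; the assertion names are hypothetical, and the two checks shown are only those that can be read off this document (the allowed `result_type` set and the shape-check condition from Step C).

```python
VALID_RESULT_TYPES = {"no_eviction", "simulated", "mixed", "physical"}


def check(summary, required, predicate):
    """Return PASS or FAIL, or UNKNOWN when any required field is missing."""
    if any(summary.get(key) is None for key in required):
        return "UNKNOWN"
    return "PASS" if predicate(summary) else "FAIL"


def run_assertions(summary):
    """Evaluate the example assertions and name the first non-PASS one as primary blocker."""
    results = {
        # Hypothetical assertion names; the conditions come from this document.
        "result_type_valid": check(
            summary, ["result_type"],
            lambda s: s["result_type"] in VALID_RESULT_TYPES,
        ),
        "shape_checks_cover_physical_steps": check(
            summary, ["shape_check_pass_steps", "physical_eviction_steps"],
            lambda s: s["shape_check_pass_steps"] == s["physical_eviction_steps"],
        ),
    }
    blocker = next(
        (f"{name}: {status}" for name, status in results.items() if status != "PASS"),
        "none",
    )
    return results, blocker
```

Picking the first non-PASS entry in declaration order is one simple way to get a single primary blocker; the real script may rank blockers differently.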

### Step C - Classify run + gate claims
Use `result_type` from summary if present; otherwise derive:
- no_eviction: eviction_steps_total==0
- physical: eviction_steps_total>0 AND simulated_eviction_steps==0 AND physical_eviction_steps>0 AND shape_check_pass_steps==physical_eviction_steps
- mixed: physical_eviction_steps>0 AND simulated_eviction_steps>0
- simulated: eviction_steps_total>0 AND physical_eviction_steps==0
Then assign claim_scope by mapping above.
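
The derivation rules in Step C can be sketched as a single function. The function name is hypothetical. One case is not covered by the rules as written: physical steps with no simulated steps but an incomplete shape check. The sketch returns `UNKNOWN` there, which is an assumption rather than documented behavior.

```python
def derive_result_type(summary):
    """Use result_type from the summary if present, otherwise derive it from step counts."""
    if summary.get("result_type"):
        return summary["result_type"]
    total = summary.get("eviction_steps_total")
    physical = summary.get("physical_eviction_steps")
    simulated = summary.get("simulated_eviction_steps")
    shape_ok = summary.get("shape_check_pass_steps")
    if total is None:
        return "UNKNOWN"
    if total == 0:
        return "no_eviction"
    if physical is None:
        return "UNKNOWN"
    if physical == 0:
        return "simulated"
    if simulated is None:
        return "UNKNOWN"
    if simulated > 0:
        return "mixed"
    # physical > 0 and simulated == 0: physical only if every step passed its shape check.
    if shape_ok is not None and shape_ok == physical:
        return "physical"
    # Assumption: the rules do not name this case, so refuse to classify it.
    return "UNKNOWN"
```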

### Step D - Write labbook updates (deterministic)
Run `scripts/update_labbook.py` with:
- run_dir or summary payload
- date (Asia/Shanghai local date)

The script creates or appends to:
- `docs/labbook/MASTER_LOG.md` (append-only)
- `docs/labbook/daily/YYYY-MM-DD.md` (append run blocks; fixed template in assets)
- `docs/labbook/INDEX.md` (index by date/run_id/result_type)

Never invent missing values; if missing, write UNKNOWN.
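
The date handling and append-only writes in Step D might look like the sketch below. It does not reproduce `scripts/update_labbook.py` or the fixed template in assets: the one-line entry format and both function names are placeholders. A fixed UTC+8 offset stands in for Asia/Shanghai, which has no daylight saving time today; `zoneinfo.ZoneInfo("Asia/Shanghai")` is the alternative where time zone data is installed.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Asia/Shanghai is currently UTC+8 year-round, so a fixed offset is sufficient here.
SHANGHAI = timezone(timedelta(hours=8))


def labbook_date(now=None):
    """Return the Asia/Shanghai local date as YYYY-MM-DD (pass a timezone-aware `now`)."""
    now = now or datetime.now(SHANGHAI)
    return now.astimezone(SHANGHAI).strftime("%Y-%m-%d")


def append_run_block(labbook_root, run_id, result_type, now=None):
    """Append a placeholder entry to MASTER_LOG.md, the daily file, and INDEX.md."""
    root = Path(labbook_root)
    day = labbook_date(now)
    daily = root / "daily" / f"{day}.md"
    daily.parent.mkdir(parents=True, exist_ok=True)
    # Missing values are written as UNKNOWN, never invented.
    entry = f"- {day} | {run_id or 'UNKNOWN'} | {result_type or 'UNKNOWN'}\n"
    for path in (root / "MASTER_LOG.md", daily, root / "INDEX.md"):
        # Mode "a" creates the file if needed and never rewrites earlier entries.
        with path.open("a", encoding="utf-8") as handle:
            handle.write(entry)
    return daily
```

For example, `append_run_block("docs/labbook", "<run_id>", "simulated")` would add one line to each of the three files for the current Shanghai date.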

### Step E - Next-step patch plan (minimal)
Generate 1-3 actions, chosen in this priority order:
(1) physical evidence chain (cache backend detect + slice + shape)
(2) functional checks
(3) attention proxy
(4) decoy effectiveness
Each action MUST include:
- file path(s)
- function/module target
- acceptance criterion measured in next `summary.json` / `logs.jsonl`
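
One way to enforce the Step E requirements is to validate each action before it goes into the plan. This is a sketch under assumptions: the `PatchAction` type, the category labels, and `build_patch_plan` are all hypothetical names chosen to mirror the priority list and the required fields above.

```python
from dataclasses import dataclass

# Hypothetical labels for the four priorities, highest first.
PRIORITY = [
    "physical_evidence_chain",
    "functional_checks",
    "attention_proxy",
    "decoy_effectiveness",
]


@dataclass
class PatchAction:
    category: str              # one of PRIORITY
    file_paths: list           # file path(s) to edit
    target: str                # function or module to change
    acceptance_criterion: str  # what to measure in the next summary.json / logs.jsonl


def build_patch_plan(actions, max_actions=3):
    """Validate actions, sort by priority, and keep at most `max_actions` (at least 1)."""
    if not actions:
        raise ValueError("a patch plan needs at least one action")
    for action in actions:
        if action.category not in PRIORITY:
            raise ValueError(f"unknown category: {action.category!r}")
        if not (action.file_paths and action.target and action.acceptance_criterion):
            raise ValueError("each action needs file paths, a target, and an acceptance criterion")
    # sorted() is stable, so actions in the same category keep their given order.
    ordered = sorted(actions, key=lambda a: PRIORITY.index(a.category))
    return ordered[:max_actions]
```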

## References to load only when needed
- For required field definitions and examples: `references/log-schema.md`
- For claim wording constraints: `references/claim-gates.md`
- For cache backend detection/slicing patterns: `references/cache-backends.md`