# kv-eviction-labbook-loop

> Audit-ready iteration workflow for KV-cache eviction experiments: validate run artifacts (summary.json/logs.jsonl), classify result_type/claim_scope, enforce causal-claim gates, and automatically write/update labbook logs (MASTER_LOG.md, daily/YYYY-MM-DD.md, INDEX.md) with fixed schemas. Use when running or tuning KV eviction/compression experiments and needing reproducible, reviewer-grade logging + next-step patch plan.

- Author: Ay1men2
- Repository: Ay1men2/KV-cache-eviction
- Version: 20260207170258
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/Ay1men2/KV-cache-eviction
- Web: https://mule.run/skillshub/@@Ay1men2/KV-cache-eviction~kv-eviction-labbook-loop:20260207170258

---

---
name: kv-eviction-labbook-loop
description: "Audit-ready iteration workflow for KV-cache eviction experiments: validate run artifacts (summary.json/logs.jsonl), classify result_type/claim_scope, enforce causal-claim gates, and automatically write/update labbook logs (MASTER_LOG.md, daily/YYYY-MM-DD.md, INDEX.md) with fixed schemas. Use when running or tuning KV eviction/compression experiments and needing reproducible, reviewer-grade logging + next-step patch plan."
---

# KV Eviction Labbook Loop (One-Skill Workflow)

## Inputs required from user

You must obtain at least ONE of:

- a run directory path: `artifacts/runs//`
- OR a full `summary.json` content

Optional but recommended:

- `logs.jsonl` snippets (first 3 + one eviction_decision_step=true + last 3)
- `marker_diagnostics.json` key fields
- change summary (git commit message or bullet edits)

If both run_dir and summary are provided, prefer run_dir.
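
The input precedence above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual loader: the function name `resolve_summary` and its parameters are hypothetical, and falling back to the pasted text when `summary.json` is absent from the run directory is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Optional


def resolve_summary(run_dir: Optional[str] = None,
                    summary_text: Optional[str] = None) -> dict[str, Any]:
    """Return the parsed summary, preferring run_dir over pasted content."""
    if run_dir is not None:
        path = Path(run_dir) / "summary.json"
        if path.is_file():
            # run_dir wins whenever both inputs are supplied.
            return json.loads(path.read_text(encoding="utf-8"))
    if summary_text is not None:
        # Assumption: use pasted content when the directory has no summary.json.
        return json.loads(summary_text)
    raise ValueError("need a run directory containing summary.json, or summary.json content")
```

Calling `resolve_summary(run_dir=..., summary_text=...)` with both arguments returns the on-disk summary, which matches the "prefer run_dir" rule.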
## Output contract (always produce in this order)

1) Gatekeeping verdict (result_type, claim_scope, paper-claim eligibility, single primary blocker)
2) Audit checklist (PASS/FAIL/UNKNOWN for required assertions)
3) Findings (facts only, numeric)
4) Mechanistic diagnosis (hypotheses labeled + falsification tests)
5) Minimal patch plan (1-3 actions, file/function + acceptance criteria)
6) Log updates written:
   - append entry in `docs/labbook/MASTER_LOG.md`
   - update `docs/labbook/INDEX.md`
   - create/append `docs/labbook/daily/YYYY-MM-DD.md`
7) Next experiment spec (config delta YAML keys)

## Hard gates (do not violate)

- result_type in {no_eviction, simulated, mixed, physical}
- claim_scope mapping:
  - physical -> eligible_for_causal_claim
  - mixed -> partial_causal_claim_with_caveats
  - simulated/no_eviction -> instrumentation_only_no_causal_claim
- Never write global causal claims unless result_type=physical and shape evidence is closed.
- Prompts must remain benign. Do not introduce harmful instruction content.

## Workflow (low freedom; follow exactly)

### Step A - Load artifacts

1) If run_dir:
   - read `summary.json`
   - if present, read `logs.jsonl` and `marker_diagnostics.json`
2) If only summary text was provided, operate on it and mark missing fields as UNKNOWN.

### Step B - Validate schema + compute assertions

Run `scripts/validate_run_artifacts.py` to produce:

- normalized fields
- PASS/FAIL/UNKNOWN assertions
- primary blocker string

If scripts are unavailable, reproduce the same checks manually and state UNKNOWN for missing fields.

### Step C - Classify run + gate claims

Use `result_type` from summary if present; otherwise derive:

- no_eviction: `eviction_steps_total==0`
- physical: `eviction_steps_total>0` AND `simulated_eviction_steps==0` AND `shape_check_pass_steps==physical_eviction_steps>0`
- mixed: `physical_eviction_steps>0` AND `simulated_eviction_steps>0`
- simulated: `eviction_steps_total>0` AND `physical_eviction_steps==0`

Then assign claim_scope by mapping above.
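
The Step C derivation and the claim_scope mapping can be sketched as a small Python function. This is an illustrative sketch only: `classify_run`, `derive_result_type`, and `_count` are hypothetical names, not the contents of `scripts/validate_run_artifacts.py`. Two behaviours are assumptions the rules leave open: a `result_type` outside the four allowed values is treated as absent, and a run that matches none of the four rules (or lacks the needed counters) is reported as UNKNOWN.

```python
from typing import Mapping, Optional

# claim_scope mapping from the hard gates.
CLAIM_SCOPE = {
    "physical": "eligible_for_causal_claim",
    "mixed": "partial_causal_claim_with_caveats",
    "simulated": "instrumentation_only_no_causal_claim",
    "no_eviction": "instrumentation_only_no_causal_claim",
}


def _count(summary: Mapping[str, object], key: str) -> Optional[int]:
    """Return an integer counter, or None when it is missing or malformed."""
    value = summary.get(key)
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


def derive_result_type(summary: Mapping[str, object]) -> str:
    """Apply the Step C rules in the order they are listed."""
    total = _count(summary, "eviction_steps_total")
    if total is None:
        return "UNKNOWN"
    if total == 0:
        return "no_eviction"
    physical = _count(summary, "physical_eviction_steps")
    simulated = _count(summary, "simulated_eviction_steps")
    shape_ok = _count(summary, "shape_check_pass_steps")
    if physical is None or simulated is None:
        return "UNKNOWN"
    if simulated == 0 and physical > 0 and shape_ok == physical:
        return "physical"
    if physical > 0 and simulated > 0:
        return "mixed"
    if physical == 0:
        return "simulated"
    # Assumption: e.g. physical steps whose shape checks did not all pass.
    return "UNKNOWN"


def classify_run(summary: Mapping[str, object]) -> tuple[str, str]:
    """Return (result_type, claim_scope), preferring the summary's own result_type."""
    result_type = summary.get("result_type")
    if result_type not in CLAIM_SCOPE:
        # Assumption: an absent or unrecognised value falls back to derivation.
        result_type = derive_result_type(summary)
    return str(result_type), CLAIM_SCOPE.get(str(result_type), "UNKNOWN")
```

Keeping the mapping in a single dictionary means the gate has one source of truth, and any run that cannot be classified never receives a causal scope.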
### Step D - Write labbook updates (deterministic)

Run `scripts/update_labbook.py` with:

- run_dir or summary payload
- date (Asia/Shanghai local date)
- produce/append:
  - `docs/labbook/MASTER_LOG.md` (append-only)
  - `docs/labbook/daily/YYYY-MM-DD.md` (append run blocks; fixed template in assets)
  - `docs/labbook/INDEX.md` (index by date/run_id/result_type)

Never invent missing values; if missing, write UNKNOWN.

### Step E - Next-step patch plan (minimal)

Generate 1-3 actions prioritized:

(1) physical evidence chain (cache backend detect + slice + shape)
(2) functional checks
(3) attention proxy
(4) decoy effectiveness

Each action MUST include:

- file path(s)
- function/module target
- acceptance criterion measured in next `summary.json` / `logs.jsonl`

## References to load only when needed

- For required field definitions and examples: `references/log-schema.md`
- For claim wording constraints: `references/claim-gates.md`
- For cache backend detection/slicing patterns: `references/cache-backends.md`
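
As an illustration of Step D's deterministic logging, the sketch below appends a run block to the daily file named after the Asia/Shanghai local date and writes UNKNOWN for any missing value. It is not the contents of `scripts/update_labbook.py`: the function names, the `FIELDS` tuple, and the bullet layout are hypothetical stand-ins for the fixed template in assets. Asia/Shanghai is approximated with a fixed UTC+8 offset so the sketch does not depend on a time zone database.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Mapping, Optional

# Fixed-offset stand-in for Asia/Shanghai (UTC+8).
SHANGHAI = timezone(timedelta(hours=8))
# Hypothetical subset of the schema; the real template lives in assets.
FIELDS = ("run_id", "result_type", "claim_scope")


def daily_log_path(root: Path, now: Optional[datetime] = None) -> Path:
    """Return <root>/daily/YYYY-MM-DD.md; `now` must be timezone-aware."""
    now = now or datetime.now(timezone.utc)
    local_date = now.astimezone(SHANGHAI).date()
    return root / "daily" / f"{local_date.isoformat()}.md"


def append_run_block(root: Path, summary: Mapping[str, Any],
                     now: Optional[datetime] = None) -> Path:
    """Append one run block; earlier content in the file is never rewritten."""
    path = daily_log_path(root, now)
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = []
    for key in FIELDS:
        value = summary.get(key)
        # Missing values are written as UNKNOWN, never invented.
        lines.append(f"- {key}: {'UNKNOWN' if value is None else value}")
    with path.open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n\n")
    return path
```

Opening the file in append mode keeps the daily log additive, and deriving the file name from the converted timestamp means a run finished at 17:00 UTC is filed under the next day in Shanghai.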