# smart-screenshot-qa

> Efficient frontend QA using the right verification method. Use when doing browser-based QA, taking screenshots, verifying UI changes, or when screenshot loops are wasting tokens. Optimizes between DOM inspection, targeted zoom, and full screenshots.

- Author: Thomas Hoffman
- Repository: tghoffdev/smart-screenshot-qa-skill
- Version: 20260202171553
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/tghoffdev/smart-screenshot-qa-skill
- Web: https://mule.run/skillshub/@@tghoffdev/smart-screenshot-qa-skill~smart-screenshot-qa:20260202171553

---

---
name: smart-screenshot-qa
description: Efficient frontend QA using the right verification method. Use when doing browser-based QA, taking screenshots, verifying UI changes, or when screenshot loops are wasting tokens. Optimizes between DOM inspection, targeted zoom, and full screenshots.
---

# Smart Screenshot QA

Stop the screenshot spirals. Choose the right verification method for the job.

## When NOT to Use This Skill

This skill optimizes for verify-and-move-on. Skip it when:

- Visual regression testing requires systematic before/after comparisons
- Accessibility audits need comprehensive coverage
- QA specs explicitly require thoroughness over efficiency

## Token Costs (Tested on GitHub.com)

| Method | Tokens | Best For |
|--------|--------|----------|
| Targeted zoom (region) | 100-200 | Component styling, specific elements |
| `find` (natural language) | 500-1,000 | "Is there a submit button?" without full DOM |
| `read_page` (filter: "interactive") | 800-2,000 | Structural checks, element existence |
| Full viewport screenshot | 1,000-1,500 | Layout verification, final sign-off |
| `read_page` (full DOM) | 6,000-25,000+ | Avoid on complex pages |

**Key insight:** Filtered DOM and full screenshots are comparable in cost. The real wins: targeted zoom is 5-10x cheaper than full screenshots, and avoiding full DOM dumps can save 15-20x relative to a full screenshot.
DOM inspection is faster for structural questions, not necessarily cheaper.

## Decision Tree

**"Is there a submit button somewhere?" (don't know the selector)**
→ `find` with natural language. Mid-cost, no selector needed.

**"Does element X exist / have correct text / attributes?" (know what you're looking for)**
→ `read_page` with `filter: "interactive"`. Faster than screenshots for structural checks.

**"Does this button/component look right?"**
→ `zoom` on that specific region. 5-10x cheaper than full screenshot.

**"Is the layout/spacing/alignment correct?"**
→ Single full screenshot, but only after batching changes.

**"Final check before shipping"**
→ One full screenshot. Done. Move on.

## Anti-Patterns (The Real Token Killers)

1. **Screenshot loops** - Taking the same screenshot over and over "to make sure." One verification that passes = done.
2. **Full screenshot for one component** - Use targeted zoom. 5-10x cheaper.
3. **Screenshot after every change** - Batch 3-5 styling/content changes, then one screenshot. For complex layout changes (flexbox, z-index, grid), verify sooner to isolate regressions.
4. **Retaking screenshots you already have** - Reference existing imageId if nothing visual changed.
5. **Full DOM dump on complex pages** - Always use `filter: "interactive"` (60-80% savings).
6. **Full-page scroll screenshots** - Use zoom for specific sections.
7. **Retrying failed zooms** - If `zoom` returns empty or ambiguous, fall back to full screenshot immediately. Don't retry.

## Why Loops Are The Real Problem

Individual screenshots aren't that expensive (700-1,500 tokens). The problem is spirals:

```
change → screenshot → tweak → screenshot → "let me check" → screenshot → "one more" → screenshot
```

5 unnecessary screenshots = 3,500-7,500 wasted tokens. Per component. Per session.

## Before/After Comparisons

You cannot retroactively view old imageIds once the context has been summarized. They persist for the browser session but disappear from context when that context gets summarized.

When you take a "before" screenshot:

- Note key visual details in text (spacing, colors, positions)
- Make changes
- Take the "after" screenshot and compare it against your notes

If you need to reference a screenshot later in a long session, note the key details in text immediately after taking it.

## Exit Criteria (When QA is Done)

Stop when:

- The specific visual or structural property requested has been verified
- No relevant console errors (`read_console_messages`)
- Interactive elements respond correctly (if applicable)

Do NOT keep screenshotting to make it "perfect" or to "double-check."

## Quick Reference

```
"Is there a [thing]?"   → find (natural language, mid-cost)
Element exists?         → read_page filter:"interactive" (fastest for structural)
Text/attribute check    → read_page filter:"interactive"
Component styling       → zoom (5-10x cheaper than full screenshot)
Layout verification     → single screenshot after batching
Final sign-off          → one screenshot, then stop
Full DOM                → avoid (can cost 15-20x more than screenshot)
```
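
The quick reference and the loop-cost arithmetic can be sketched as a small lookup table in Python. This is only an illustration: the labels such as `component_styling` and the helpers `choose_method` and `wasted_tokens` are hypothetical names, not part of this skill or of any browser tool API. The token ranges are copied from the cost table, and the 700-1,500 per-screenshot range comes from the loops section.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """A verification method and its approximate token cost range."""
    tool: str
    min_tokens: int
    max_tokens: int


# Hypothetical question labels mapped to the methods in the quick reference.
# Token ranges are taken from the cost table in this document.
METHODS = {
    "find_unknown_selector": Method("find", 500, 1_000),
    "structural_check": Method('read_page filter:"interactive"', 800, 2_000),
    "component_styling": Method("zoom", 100, 200),
    "layout_check": Method("full screenshot", 1_000, 1_500),
    "final_signoff": Method("full screenshot", 1_000, 1_500),
}


def choose_method(question: str) -> Method:
    """Return the suggested method for a question label (KeyError if unknown)."""
    return METHODS[question]


def wasted_tokens(extra_screenshots: int, low: int = 700, high: int = 1_500) -> tuple[int, int]:
    """Estimate the token range spent on redundant screenshots.

    Defaults use the 700-1,500 per-screenshot range from the loops section.
    """
    return extra_screenshots * low, extra_screenshots * high


if __name__ == "__main__":
    print(choose_method("component_styling").tool)  # zoom
    print(wasted_tokens(5))  # (3500, 7500)
```

Keeping the costs in one table makes the trade-off explicit: the lookup picks the method, and `wasted_tokens` shows how quickly a spiral of repeat screenshots adds up.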