# cdp-skill

> Automate Chrome browser interactions via JSON passed to a Node.js CLI. Use when you need to navigate websites, fill forms, click elements, extract data, or run end-to-end browser tests. Automatic screenshots on every action. Supports accessibility snapshots for resilient element targeting.

- Author: lotreace
- Repository: lotreace/cdp-skill
- Version: 20260203211600
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/lotreace/cdp-skill
- Web: https://mule.run/skillshub/@@lotreace/cdp-skill~cdp-skill:20260203211600

---

---
name: cdp-skill
description: Automate Chrome browser interactions via JSON passed to a Node.js CLI. Use when you need to navigate websites, fill forms, click elements, extract data, or run end-to-end browser tests. Automatic screenshots on every action. Supports accessibility snapshots for resilient element targeting.
license: MIT
compatibility: Requires Chrome/Chromium (auto-launched if not running) and Node.js.
---

# CDP Browser Automation Skill

Automate Chrome browser interactions via JSON step definitions passed to a Node.js CLI.

> **See EXAMPLES.md** for full JSON examples, response shapes, and worked patterns for every step type.

## Site Profiles

Site profiles are per-domain cheatsheets stored at `~/.cdp-skill/sites/{domain}.md`. They record what the agent learned about a site: framework, quirks, stable selectors, and recipes for common flows.

> **IMPORTANT — `actionRequired` responses are MANDATORY.** When any response contains `actionRequired`, you **MUST** complete it before doing anything else. Do not skip it. Do not proceed to your main task first. Handle `actionRequired` immediately.

### How navigation uses profiles

Every `goto` and `openTab` (with URL) checks for a profile. The result appears as a **top-level field** in the response:

- **`siteProfile`** present → read it before doing anything else. It contains strategies, quirks, and recipes. Apply its `settledWhen`/`readyWhen` hooks, use its selectors, and respect its quirks.
- **`actionRequired`** present → **STOP. Create the profile NOW** before continuing your task:
  1. `snapshot` — map page structure and landmarks
  2. `pageFunction` — detect framework (e.g. `() => { return { react: !!window.__REACT, next: !!window.__NEXT_DATA__, vue: !!window.__VUE__ } }`)
  3. `writeSiteProfile` — save domain and markdown content

The profile only needs to capture what's useful: environment, quirks, stable selectors, strategies for fill/click/wait, and recipes for common flows. A minimal profile is fine — even just the environment and one quirk is valuable.

### Updating profiles after your task

After completing your goal, update the site profile with anything you learned. Discovered a quirk? Found a reliable selector? Worked out a multi-step flow? Call `writeSiteProfile` again with the improved content before closing your tab. If you didn't learn anything new, skip the update.

### readSiteProfile

`"domain"` | `{domain}` — returns `{found, domain, content}` or `{found: false, domain}`

Read a site profile without navigating. Use this to check or update a profile ad-hoc.

### writeSiteProfile

`{domain, content}` — returns `{written, path, domain}`

### Profile format

```
# example.com
Updated: 2026-02-03  |  Fingerprint: react18-next-spa

## Environment
- React 18.x, Next.js (SSR)
- SPA with pushState navigation

## Quirks
- Turbo intercepts link clicks — use settledWhen with URL check
- Search results load async — poll for .results-count before extracting

## Strategies
### fill (React controlled inputs)
Use nativeSetter.call(el, value) then dispatch input event with bubbles.

### Navigation readiness
settledWhen: () => !document.querySelector('.loading-bar')

## Regions
- mainContent: main, [role="main"]
- navigation: .nav-bar
- searchResults: #search-results

## Recipes
### Login
pipeline: find #username fill {{user}}, find #password fill {{pass}}, find #login click, waitFor () => location.pathname !== '/login'
```

Sections are optional — include what's useful. The key sections agents rely on:
- **Quirks**: pitfalls that cause failures without foreknowledge
- **Strategies**: how to fill, click, or wait on this specific site
- **Recipes**: pre-built step sequences for common flows

## Quick Start

**1. Check Chrome status** (auto-launches if needed):
`node src/cdp-skill.js '{"steps":[{"chromeStatus":true}]}'`

**2. Open a tab and navigate:**
`node src/cdp-skill.js '{"steps":[{"openTab":"https://google.com"}]}'`

**3. Use the returned tab ID for subsequent calls:**
`node src/cdp-skill.js '{"config":{"tab":"t1"},"steps":[{"click":"#btn"}]}'`

Tab IDs (t1, t2, ...) persist across CLI invocations. Stdin pipe also works: `echo '{"steps":[...]}' | node src/cdp-skill.js`

## Input / Output Schema

**Input fields:**
- `config`: `{host, port, tab, timeout, headless}` — tab is required after first call
- `steps`: array of step objects (one action per step)

**Output fields:**
- `status`: "ok" or "error"
- `tab`: short tab ID (e.g. "t1")
- `siteProfile`: full markdown content of existing profile (after goto/openTab to known site)
- `actionRequired`: `{action, domain, message}` — **MUST be handled immediately** before continuing (see Site Profiles)
- `context`: `{url, title, scroll: {y, percent}, viewport: {width, height}, activeElement?, modal?}`
- `screenshot`: path to after-screenshot (auto-captured on every visual action)
- `fullSnapshot`: path to full-page accessibility snapshot file
- `viewportSnapshot`: inline viewport-only snapshot YAML
- `changes`: `{summary, added[], removed[], changed[]}` — viewport diff on same-page interactions
- `navigated`: true when URL pathname changed
- `console`: `{errors, warnings, messages[]}` — captured errors/warnings
- `steps[]`: `{action, status}` on success; adds `{params, error, context}` on failure
- `errors[]`: only present when steps failed

**Failure diagnostics**: failed steps include `context` with `visibleButtons`, `visibleLinks`, `visibleErrors`, `nearMatches` (fuzzy matches with scores), and scroll position.

**Error types**: PARSE, VALIDATION, CONNECTION, EXECUTION. Exit code: 0 = ok, 1 = error.

## Element References

Snapshots return versioned refs like `[ref=s1e4]` — format: `s{snapshotId}e{elementNumber}`.
Use refs with `click`, `fill`, `hover`. Each snapshot increments the ID. Refs from earlier snapshots remain valid while the element is in DOM.

**Auto re-resolution**: when a ref's element leaves the DOM (React re-render, lazy-load), the system tries to re-find it by stored selector + role + name. Response includes `reResolved: true` on success.

## Auto-Waiting

| Action | Waits For |
|--------|-----------|
| `click` | visible, enabled, stable, not covered, pointer-events |
| `fill`, `type` | visible, enabled, editable |
| `hover` | visible, stable |

Use `force: true` to bypass all checks. **Auto-force**: when actionability times out but element exists, automatically retries with force (outputs `autoForced: true`).

## Action Hooks

Optional parameters on action steps to customize the step lifecycle:

- **readyWhen**: `"() => condition"` — polled until truthy **before** the action executes
- **settledWhen**: `"() => condition"` — polled until truthy **after** the action completes
- **observe**: `"() => data"` — runs after settlement, return value appears in `result.observation`

Hooks can be combined on any action step. Applies to: click, fill, press, hover, drag, selectOption, scroll.

## Step Reference

### Chrome Management

> **IMPORTANT**: Never launch Chrome manually via shell commands. Always use `chromeStatus`.

**chromeStatus**: true | {autoLaunch, headless}
  returns: {running, launched, version, port, tabs[]}

### Navigation

**goto**: "url" | {url, waitUntil}
  waitUntil: commit | domcontentloaded | load | networkidle
  response: top-level `siteProfile` or `actionRequired` (see Site Profiles)

**reload**: true | {waitUntil}

**back** / **forward**: true
  returns: {url, title} or {noHistory: true}

**waitForNavigation**: true | {timeout, waitUntil}

### Frames

**listFrames**: true
  returns: {mainFrameId, currentFrameId, frames[]}

**switchToFrame**: "selector" | index | {selector, index, name, frameId}
  returns: {frameId, url, name}

**switchToMainFrame**: true

### Waiting

**wait**: "selector" | number(ms) | {selector, hidden, minCount} | {text} | {textRegex} | {urlContains}

### Interaction

**click**: "selector" | "ref" | {ref, selector, text, x/y, selectors[]}
  options: force, jsClick, nativeOnly, doubleClick, scrollUntilVisible, searchFrames, exact, timeout, waitAfter
  hooks: readyWhen, settledWhen, observe
  returns: {clicked, method: "cdp"|"jsClick"|"jsClick-auto", navigated?, newUrl?}

**fill**: {selector|ref|label, value}
  options: clear(true), react, force, exact, timeout
  hooks: readyWhen, settledWhen, observe
  returns: {filled, navigated?, newUrl?}

**fillForm**: {"#sel": "value", ...}
  returns: {total, filled, failed, results[]}

**fillActive**: "value" | {value, clear}
  returns: {filled, tag, type, selector, valueBefore, valueAfter}

**type**: {selector, text, delay}
  returns: {selector, typed, length}

**press**: "Enter" | "Control+a" | "Meta+Shift+Enter"

**select**: "selector" | {selector, start, end}
  returns: {selector, start, end, selectedText, totalLength}

**hover**: "selector" | {selector, ref, duration, captureResult}
  hooks: readyWhen, settledWhen, observe

**drag**: {source, target}
  source/target: "selector" | "ref" | {ref, offsetX, offsetY} | {x, y}
  options: steps(10), delay(0)
  hooks: readyWhen, settledWhen, observe
  returns: {dragged, method: "html5-dnd"|"range-input"|"mouse-events", source, target}

**selectOption**: {selector, value|label|index|values}
  returns: {selected[], multiple}
  notes: uses JS to set option.selected (native dropdowns can't be clicked via CDP)

### Scrolling

**scroll**: "top" | "bottom" | "selector" | {deltaY} | {x, y}
  returns: {scrollX, scrollY}

### Data Extraction

**extract**: "selector" | {selector, type: auto|table|list|text, includeHeaders}
  returns: table → {type, headers, rows, rowCount, columnCount}; list → {type, items, itemCount}

**getDom**: true | "selector" | {selector, outer(true)}
  returns: {html, tagName, selector, length}

**getBox**: "ref" | ["ref1", "ref2"] | {refs: [...]}
  returns: {x, y, width, height, center} per ref

**refAt**: {x, y}
  returns: {ref, existing, tag, selector, clickable, role, name, box}

**elementsAt**: [{x, y}, ...]
  returns: {count, elements[]}

**elementsNear**: {x, y, radius(50), limit(20)}
  returns: {center, radius, count, elements[]}

**formState**: "selector" | {selector, includeHidden}
  returns: {selector, action, method, fields[], valid, fieldCount}

**query**: "selector" | {selector, limit(10), output} | {role, name, nameExact, nameRegex, checked, disabled, level, countOnly, refs}
  output: text | html | href | value | tag | {attribute: "data-id"}
  returns: {selector, total, showing, results[]}

**queryAll**: {"label": "selector", ...}

**inspect**: true | {selectors[], limit}
  returns: {title, url, counts, custom}

**console**: true | {level, type, since, limit, clear, stackTrace}
  returns: {total, showing, messages[]}
  notes: logs don't persist across CLI invocations

### Accessibility Snapshot

**snapshot**: true | {root, detail, mode, maxDepth, maxElements, includeText, includeFrames, pierceShadow, viewportOnly, inlineLimit, since}
  detail: summary | interactive | full(default)
  since: "s1" — returns {unchanged: true} if page hasn't changed
  returns: YAML with role, "name", states, [ref=s{N}e{M}], snapshotId
  notes: snapshots over 9KB saved to file (configurable via inlineLimit)

**snapshotSearch**: {text, pattern, role, exact, limit(10), context, near: {x, y, radius}}
  returns: {matches[], matchCount, searchedElements}
  notes: refs from snapshotSearch persist across subsequent commands

### Screenshots & PDF

Screenshots auto-captured on every visual action to `/tmp/cdp-skill/<tab>.before.png` and `.after.png`.

**pdf**: "filename" | {path, selector, landscape, printBackground, scale, paperWidth, paperHeight, margins, pageRanges, validate}
  returns: {path, fileSize, fileSizeFormatted, pageCount, dimensions}

### Dynamic Browser Execution

**pageFunction**: "() => expr" | {fn, refs(bool), timeout}
  notes: auto-wrapped as IIFE, refs passes window.__ariaRefs, return value auto-serialized, runs in current frame context

**poll**: "() => predicate" | {fn, interval(100), timeout(30000)}
  returns: {resolved, value|lastValue, elapsed}

**pipeline**: [{find+fill|click|type|check|select}, {waitFor}, {sleep}, {return}] | {steps[], timeout(30000)}
  returns: {completed, steps, results[]} or {completed: false, failedAt, error, results[]}
  notes: compiles to single async JS function, zero roundtrips between micro-ops

### Form Validation

**validate**: "selector"
  returns: {valid, message, validity}

**submit**: "selector" | {selector, reportValidity}
  returns: {submitted, valid, errors[]}

### Assertions

**assert**: {url: {contains|equals|startsWith|endsWith|matches}} | {text} | {selector, text, caseSensitive}

### Tab Management

**listTabs**: true
  returns: {count, tabs[]}

**closeTab**: "tabId"
  returns: {closed}

**openTab**: true | "url" | {url}
  returns: {opened, tab, url, navigated, viewportSnapshot, fullSnapshot, context}
  response: top-level `siteProfile` or `actionRequired` when URL provided (see Site Profiles)
  notes: REQUIRED as first step when no tab specified

### Viewport

**viewport**: "preset" | {width, height, deviceScaleFactor, mobile, hasTouch, isLandscape}
  presets: iphone-se, iphone-14, iphone-15-pro, ipad, ipad-pro-11, pixel-7, samsung-galaxy-s23, desktop, desktop-hd, macbook-pro-14
  returns: {width, height, deviceScaleFactor, mobile, hasTouch}

### Cookies

**cookies**: {get: true|[urls], name?} | {set: [{name, value, domain, expires}]} | {delete: "name", domain?} | {clear: true, domain?}
  expires formats: 30m, 1h, 7d, 1w, 1y, or Unix timestamp
  returns: get → {cookies[]}; set → {action, count}; delete/clear → {action, count}
  notes: get without URLs returns only current tab's domain cookies

### JavaScript Execution

**eval**: "expression" | {expression, await, timeout, serialize}
  returns: typed results (Number, Date, Map, Set, Element, NodeList, etc.)

### Optional Steps

Add `"optional": true` to any step to continue on failure (status becomes "skipped").

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Tabs accumulating | Include `tab` in config |
| CONNECTION error | Use `chromeStatus` first |
| Chrome not found | Set `CHROME_PATH` env var |
| Element not found | Add `wait` step first |
| Clicks not working | Scroll into view first, or `force: true` |
| `back` returns noHistory | New tabs start at about:blank; navigate first |
| Select dropdown not working | Use click+click or press arrow keys |
| Type not appearing | Click input first to focus, then type |
| macOS: Chrome running but no CDP | `chromeStatus` launches new instance with CDP enabled |

## Best Practices

- **Handle `actionRequired` immediately** — when a response contains this field, complete it before doing anything else
- **Never launch Chrome directly** — always use `chromeStatus`
- **Use openTab** as your first step to create a tab; use the returned tab ID for all subsequent calls
- **Reuse only your own tabs** — other agents may share the browser
- **Update the site profile before closing** — add any quirks, selectors, or recipes you discovered
- **Close your tab when done** — `closeTab` with your tab ID
- **Discover before interacting** — use `snapshot` and `inspect` to understand the page
- **Use website navigation** — click links and submit forms; don't guess URLs
- **Prefer refs** over CSS selectors — use `snapshot` + refs for resilient targeting
- **Be persistent** — try alternative selectors, add waits, scroll first