# agent-browser

> Use when a task needs headless browser automation (navigate, click, fill forms, extract text, take screenshots) and Playwright-based tooling should be avoided; prefer the agent-browser CLI.

- Author: Rahul Maurya
- Repository: rahul123gautam/my-crazy-skills
- Version: 20260206170949
- Stars: 1
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/rahul123gautam/my-crazy-skills
- Web: https://mule.run/skillshub/@@rahul123gautam/my-crazy-skills~agent-browser:20260206170949

---

## Overview

Use `agent-browser` (CLI) for deterministic, scriptable headless browser automation via snapshots and stable element refs (`@e1`, `@e2`, ...).

## Core Workflow

1. Open a page: `agent-browser open <url>`
2. Get a ref-based accessibility snapshot: `agent-browser snapshot -i`
3. Interact using refs (preferred) or selectors:
   - Click: `agent-browser click @e2`
   - Fill: `agent-browser fill @e3 "text"`
   - Press: `agent-browser press Enter`
4. Extract info:
   - Visible text: `agent-browser get text @e1`
   - URL/title: `agent-browser get url` / `agent-browser get title`
5. Capture artifacts:
   - Screenshot: `agent-browser screenshot --full path/to.png`
   - Trace: `agent-browser trace start` / `agent-browser trace stop path/to-trace.json.gz`

## Quick Reference

### Common Flags

- Reuse a session across commands: `--session <name>` (or `AGENT_BROWSER_SESSION=<name>`)
- Machine-readable output: `--json`
- Debug output: `--debug`
- Run with a visible window (debugging): `--headed`

### Playwright → agent-browser Mapping

| Intent | Playwright concept | `agent-browser` command |
| --- | --- | --- |
| Navigate | `page.goto(url)` | `agent-browser open <url>` |
| Snapshot | a11y snapshot / locator discovery | `agent-browser snapshot -i` |
| Click | `locator.click()` | `agent-browser click <selector>` or `agent-browser click @eN` |
| Fill | `locator.fill()` | `agent-browser fill <selector> <text>` or `agent-browser fill @eN <text>` |
| Type | `locator.type()` | `agent-browser type <selector> <text>` |
| Press key | `page.keyboard.press()` | `agent-browser press <key>` |
| Wait for element | `page.waitForSelector()` | `agent-browser wait <selector>` |
| Read text | `locator.innerText()` | `agent-browser get text <selector>` |
| Screenshot | `page.screenshot()` | `agent-browser screenshot [--full] [path]` |
| Evaluate JS | `page.evaluate()` | `agent-browser eval <js>` |

## Examples

### Login/Form Flow (ref-first)

```bash
# Share one browser session across all commands in this flow.
export AGENT_BROWSER_SESSION="acme-login"

agent-browser open "https://example.com/login"

# Refs below (@e12, ...) come from the snapshot output and will differ per page.
agent-browser snapshot -i
agent-browser fill @e12 "user@example.com"
agent-browser fill @e13 "correct horse battery staple"
agent-browser click @e14

# Wait for the post-login element, then read its text.
agent-browser wait @e20
agent-browser get text @e20
```

### Auth Headers (origin-scoped)

```bash
agent-browser open "https://example.com"
# Pass the Authorization header when opening the protected page.
agent-browser --headers '{"Authorization":"Bearer YOUR_TOKEN"}' open "https://example.com/protected"
agent-browser snapshot -i --json
```

## Common Mistakes

- Not using `--session` when a multi-step flow spans multiple commands.
- Clicking with brittle CSS selectors when a `snapshot -i` ref is available.
- Forgetting `--json` when downstream steps require structured parsing.
- In sandboxed environments, the `agent-browser` daemon may fail to start due to restricted writes to user cache/home; rerun with the necessary filesystem permissions.
- If the browser can’t launch, run `agent-browser install` (may require network access).
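
The session and `--json` mistakes listed above can be guarded against with a small wrapper. The sketch below is illustrative only: `ab_run` and `AGENT_BROWSER_BIN` are hypothetical names that are not part of `agent-browser`, the check looks only at the `AGENT_BROWSER_SESSION` environment variable (not at a `--session` flag in the arguments), and it assumes every subcommand accepts a trailing `--json`, as `snapshot -i --json` does in the example above.

```bash
# Hypothetical wrapper, not part of agent-browser.
# AGENT_BROWSER_BIN is an assumed override for the binary name (useful for dry runs).
ab_run() {
  # Refuse to run without a session so multi-step flows share one browser.
  if [ -z "${AGENT_BROWSER_SESSION:-}" ]; then
    echo "ab_run: set AGENT_BROWSER_SESSION before running a multi-step flow" >&2
    return 2
  fi
  # Append --json so downstream steps always receive structured output.
  "${AGENT_BROWSER_BIN:-agent-browser}" "$@" --json
}
```

With `AGENT_BROWSER_SESSION` exported, `ab_run snapshot -i` runs `agent-browser snapshot -i --json`; without it, the wrapper prints an error and returns status 2 instead of starting a throwaway session. Setting `AGENT_BROWSER_BIN=echo` prints the arguments that would be passed instead of launching a browser.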