# agent-browser > Complete browser automation using agent-browser CLI with AI-friendly semantic locators, ref system, and advanced waiting. Use when Claude needs to navigate and interact with web pages, extract structured data, fill forms, take screenshots, run E2E tests, or debug web applications. Covers semantic locators (role/text/label/placeholder), snapshot+ref workflow, session isolation with --profile, visible browser with --headed, JSON output for token efficiency, and error recovery strategies. - Author: Zeno Wang - Repository: ZenoWangzy/agent-browser-claude-plugin - Version: 20260201184135 - Stars: 1 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/ZenoWangzy/agent-browser-claude-plugin - Web: https://mule.run/skillshub/@@ZenoWangzy/agent-browser-claude-plugin~agent-browser:20260201184135 --- --- name: agent-browser description: Complete browser automation using agent-browser CLI with AI-friendly semantic locators, ref system, and advanced waiting. Use when Claude needs to navigate and interact with web pages, extract structured data, fill forms, take screenshots, run E2E tests, or debug web applications. Covers semantic locators (role/text/label/placeholder), snapshot+ref workflow, session isolation with --profile, visible browser with --headed, JSON output for token efficiency, and error recovery strategies. --- # agent-browser AI-first browser automation CLI with semantic locators, ref system, and persistent sessions. **Key Advantages over Playwright MCP:** - **93% token savings** (daemon persistence) - **AI-friendly semantic locators** (find role/text/label) - **Visible browser mode** (--headed) - **Cookie persistence** (--profile) ## Prerequisites Check First, verify agent-browser is installed: ```bash # Check if agent-browser is available if ! command -v agent-browser &> /dev/null; then echo "agent-browser not found. Installing..." npm install -g agent-browser agent-browser install # Download Chromium else echo "agent-browser is installed: $(agent-browser --version)" fi ``` ## Quick Start ### Core Workflow (Recommended - Visual Mode) ```bash # 1. Generate unique session ID SESSION="task_$(date +%s)" # 2. Navigate with visible browser agent-browser --profile $SESSION --headed open https://example.com # 3. Snapshot to get element refs agent-browser --profile $SESSION snapshot -i # 4. Click using ref ID agent-browser --profile $SESSION click @e2 # 5. Re-snapshot after page changes agent-browser --profile $SESSION snapshot -i ``` ### Alternative: Semantic Locators (No Snapshot Needed) ```bash agent-browser --profile $SESSION --headed find role button click --name "Submit" agent-browser --profile $SESSION --headed find label "Email" fill "test@test.com" agent-browser --profile $SESSION --headed find text "More information" click ``` ## Best Practices ### 1. Visual Mode (--headed) - RECOMMENDED Show browser window so users can see AI operating in real-time: ```bash # GOOD - Visible browser (user sees what's happening) agent-browser --profile mysession --headed open https://example.com # Acceptable - Headless mode (background, faster but invisible) agent-browser --profile mysession open https://example.com ``` ### 2. Session Isolation (--profile) - REQUIRED **Always use `--profile `** for state isolation and automatic cookie saving: ```bash # GOOD - Isolated session with automatic cookie persistence agent-browser --profile github_session --headed open https://github.com/login # ... login process ... # Cookies automatically saved to github_session profile # Next time - no need to login again! agent-browser --profile github_session --headed open https://github.com/dashboard # Automatically logged in! # BAD - Shared state (cookies may mix between tasks) agent-browser open https://example.com # Don't do this! ``` **Profile Benefits:** - Login sessions saved across runs - Cookies persist automatically - localStorage/sessionStorage preserved - Multiple isolated sessions can run simultaneously ### 3. Token Efficiency (--json) Use `--json` flag to reduce output tokens: ```bash agent-browser --json get title agent-browser --json get text .content agent-browser --json snapshot ``` ### 4. Error Recovery If Ref ID fails ("Node not found" or "Detach"), re-snapshot: ```bash # Ref fails agent-browser click @e2 # Error: Node not found # Recovery: Re-snapshot and get new ref agent-browser --profile $SESSION snapshot -i # Now button is @e5 agent-browser --profile $SESSION click @e5 ``` ## Core Commands | Command | Description | Example | |---------|-------------|---------| | `open ` | Navigate to URL | `open https://example.com` | | `snapshot` | Get page elements with refs | `snapshot -i` (interactive refs) | | `click` | Click element | `click @e2` or `click ".button"` | | `fill` | Fill input field | `fill @e3 "text"` or `fill "#email" "a@b.com"` | | `screenshot` | Capture screenshot | `screenshot output.png` | | `get title` | Get page title | `--json get title` | | `get text` | Get element text | `get text ".content"` | | `wait` | Wait for condition | `wait --text "Loaded"` | | `find` | Semantic locator | `find role button click` | ## Semantic Locators AI-friendly element targeting without CSS selectors: | Locator Type | Example | Use Case | |--------------|---------|----------| | `find role` | `find role button click --name "Submit"` | Buttons, links, inputs | | `find text` | `find text "More info" click` | Any visible text | | `find label` | `find label "Email" fill "a@b.com"` | Form labels | | `find placeholder` | `find placeholder "Search..." fill "query"` | Input placeholders | | `find testid` | `find testid "submit-btn" click` | data-testid attributes | **Actions:** `click`, `fill`, `hover`, `check`, `uncheck`, `press` **Modifiers:** `first`, `last`, `nth(2)` See [references/semantic-locators.md](./references/semantic-locators.md) for complete guide. ## Ref System The ref system assigns unique IDs (@e1, @e2...) to page elements: ```bash # 1. Snapshot page agent-browser --profile $SESSION snapshot -i # Output: # @e1 # @e2 # @e3 # 2. Use refs agent-browser --profile $SESSION click @e2 agent-browser --profile $SESSION fill @e3 "Hello" ``` **Important:** Refs are invalidated when page changes. Always re-snapshot after navigation or DOM updates. See [references/ref-system.md](./references/ref-system.md) for details. ## Troubleshooting | Issue | Solution | |-------|----------| | "Node not found" | Re-snapshot: `snapshot -i` | | Element not visible | Use `wait --text "..."` before clicking | | Login state lost | Use `--profile` to persist cookies | | Can't see browser | Add `--headed` flag | | Too much output | Add `--json` flag | ## Security Warning > [!CAUTION] > Access to `file:///` local paths requires explicit user authorization. > Never access local files without user consent. ## Advanced Features For detailed documentation, see: - [Semantic Locators](./references/semantic-locators.md) - Complete locator guide - [Ref System](./references/ref-system.md) - Snapshot + ref workflow - [Visual Mode](./references/visual-mode.md) - --headed mode guide - [Sessions & Profiles](./references/sessions-profiles.md) - Cookie persistence - [Advanced Waiting](./references/advanced-waiting.md) - wait commands - [Network & Storage](./references/network-storage.md) - Cookies, interception - [Debugging](./references/debugging.md) - Console, errors, trace - [Complete API](./references/api-complete.md) - All commands reference