Agent skill

agent-browser

Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection

View SKILL.md on GitHub Repository

Stars 1,878

Forks 294

Install this agent skill to your Project

npx add-skill https://github.com/LeoYeAI/openclaw-master-skills/tree/main/skills/agent-browser-clawdbot

Metadata

Additional technical details for this skill

clawdbot: { "emoji": "\ud83c\udf10", "homepage": "https://github.com/vercel-labs/agent-browser", "requires": { "commands": [ "agent-browser" ] } }

SKILL.md

Agent Browser Skill

Fast browser automation using accessibility tree snapshots with refs for deterministic element selection.

Why Use This Over Built-in Browser Tool

Use agent-browser when:

Automating multi-step workflows
Need deterministic element selection
Performance is critical
Working with complex SPAs
Need session isolation

Use built-in browser tool when:

Need screenshots/PDFs for analysis
Visual inspection required
Browser extension integration needed

Core Workflow

bash

# 1. Navigate and snapshot
agent-browser open https://example.com
agent-browser snapshot -i --json

# 2. Parse refs from JSON, then interact
agent-browser click @e2
agent-browser fill @e3 "text"

# 3. Re-snapshot after page changes
agent-browser snapshot -i --json

Key Commands

Navigation

bash

agent-browser open <url>
agent-browser back | forward | reload | close

Snapshot (Always use -i --json)

bash

agent-browser snapshot -i --json          # Interactive elements, JSON output
agent-browser snapshot -i -c -d 5 --json  # + compact, depth limit
agent-browser snapshot -s "#main" -i      # Scope to selector

Interactions (Ref-based)

bash

agent-browser click @e2
agent-browser fill @e3 "text"
agent-browser type @e3 "text"
agent-browser hover @e4
agent-browser check @e5 | uncheck @e5
agent-browser select @e6 "value"
agent-browser press "Enter"
agent-browser scroll down 500
agent-browser drag @e7 @e8

Get Information

bash

agent-browser get text @e1 --json
agent-browser get html @e2 --json
agent-browser get value @e3 --json
agent-browser get attr @e4 "href" --json
agent-browser get title --json
agent-browser get url --json
agent-browser get count ".item" --json

Check State

bash

agent-browser is visible @e2 --json
agent-browser is enabled @e3 --json
agent-browser is checked @e4 --json

Wait

bash

agent-browser wait @e2                    # Wait for element
agent-browser wait 1000                   # Wait ms
agent-browser wait --text "Welcome"       # Wait for text
agent-browser wait --url "**/dashboard"   # Wait for URL
agent-browser wait --load networkidle     # Wait for network
agent-browser wait --fn "window.ready === true"

Sessions (Isolated Browsers)

bash

agent-browser --session admin open site.com
agent-browser --session user open site.com
agent-browser session list
# Or via env: AGENT_BROWSER_SESSION=admin agent-browser ...

State Persistence

bash

agent-browser state save auth.json        # Save cookies/storage
agent-browser state load auth.json        # Load (skip login)

Screenshots & PDFs

bash

agent-browser screenshot page.png
agent-browser screenshot --full page.png
agent-browser pdf page.pdf

Network Control

bash

agent-browser network route "**/ads/*" --abort           # Block
agent-browser network route "**/api/*" --body '{"x":1}'  # Mock
agent-browser network requests --filter api              # View

Cookies & Storage

bash

agent-browser cookies                     # Get all
agent-browser cookies set name value
agent-browser storage local key           # Get localStorage
agent-browser storage local set key val

Tabs & Frames

bash

agent-browser tab new https://example.com
agent-browser tab 2                       # Switch to tab
agent-browser frame @e5                   # Switch to iframe
agent-browser frame main                  # Back to main

Snapshot Output Format

json

{
  "success": true,
  "data": {
    "snapshot": "...",
    "refs": {
      "e1": {"role": "heading", "name": "Example Domain"},
      "e2": {"role": "button", "name": "Submit"},
      "e3": {"role": "textbox", "name": "Email"}
    }
  }
}

Best Practices

Always use -i flag - Focus on interactive elements
Always use --json - Easier to parse
Wait for stability - agent-browser wait --load networkidle
Save auth state - Skip login flows with state save/load
Use sessions - Isolate different browser contexts
Use --headed for debugging - See what's happening

Example: Search and Extract

bash

agent-browser open https://www.google.com
agent-browser snapshot -i --json
# AI identifies search box @e1
agent-browser fill @e1 "AI agents"
agent-browser press Enter
agent-browser wait --load networkidle
agent-browser snapshot -i --json
# AI identifies result refs
agent-browser get text @e3 --json
agent-browser get attr @e4 "href" --json

Example: Multi-Session Testing

bash

# Admin session
agent-browser --session admin open app.com
agent-browser --session admin state load admin-auth.json
agent-browser --session admin snapshot -i --json

# User session (simultaneous)
agent-browser --session user open app.com
agent-browser --session user state load user-auth.json
agent-browser --session user snapshot -i --json

Installation

bash

npm install -g agent-browser
agent-browser install                     # Download Chromium
agent-browser install --with-deps         # Linux: + system deps

Credits

Skill created by Yossi Elkrief (@MaTriXy)

agent-browser CLI by Vercel Labs

Maintainer

LeoYeAI Core maintainer

Source details

Full Name: LeoYeAI/openclaw-master-skills
Branch: main
Path in repo: skills/agent-browser-clawdbot
License: MIT License
Topics: skills openclaw ai-agent agentskills myclaw curated skill-collection weekly

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

LeoYeAI/openclaw-master-skills

audit-website

Audit websites for SEO, performance, security, technical, content, and 15 other issue cateories with 230+ rules using the squirrelscan CLI. Returns LLM-optimized reports with health scores, broken links, meta tag analysis, and actionable recommendations. Use to discover and asses website or webapp issues and health.

1,878 294

Explore

LeoYeAI/openclaw-master-skills

firecrawl

Web search and scraping via Firecrawl API. Use when you need to search the web, scrape websites (including JS-heavy pages), crawl entire sites, or extract structured data from web pages. Requires FIRECRAWL_API_KEY environment variable.

1,878 294

Explore

LeoYeAI/openclaw-master-skills

computer-use

Full desktop computer use for headless Linux servers. Xvfb + XFCE virtual desktop with xdotool automation. 17 actions (click, type, scroll, screenshot, drag, etc). Unlike OpenClaw's browser tool, operates at the X11 level so websites cannot detect automation. Includes VNC for live viewing.

1,878 294

Explore

LeoYeAI/openclaw-master-skills

social-media-analyzer

Social media campaign analysis and performance tracking. Calculates engagement rates, ROI, and benchmarks across platforms. Use for analyzing social media performance, calculating engagement rate, measuring campaign ROI, comparing platform metrics, or benchmarking against industry standards.

1,878 294

Explore

LeoYeAI/openclaw-master-skills

business-growth-skills

4 production-ready business and growth skills: customer success manager with health scoring and churn prediction, sales engineer with RFP analysis, revenue operations with pipeline and GTM metrics, and contract & proposal writer. Python tools included (all stdlib-only). Works with Claude Code, Codex CLI, and OpenClaw.

1,878 294

Explore

LeoYeAI/openclaw-master-skills

contract-and-proposal-writer

Contract & Proposal Writer

1,878 294

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

Metadata

SKILL.md

Agent Browser Skill

Why Use This Over Built-in Browser Tool

Core Workflow

Key Commands

Navigation

Snapshot (Always use -i --json)

Interactions (Ref-based)

Get Information

Check State

Wait

Sessions (Isolated Browsers)

State Persistence

Screenshots & PDFs

Network Control

Cookies & Storage

Tabs & Frames

Snapshot Output Format

Best Practices

Example: Search and Extract

Example: Multi-Session Testing

Installation

Credits

Recommended Agent Skills

audit-website

firecrawl

computer-use

social-media-analyzer

business-growth-skills

contract-and-proposal-writer