Agent skill

agent-browser

Browse the web for any task — research topics, read articles, interact with web apps, fill forms, take screenshots, extract data, and test web pages. Use whenever a browser would be useful, not just when the user explicitly asks.

View SKILL.md on GitHub Repository

Stars 27,176

Forks 11,781

Install this agent skill to your Project

npx add-skill https://github.com/qwibitai/nanoclaw/tree/main/container/skills/agent-browser

SKILL.md

Browser Automation with agent-browser

Quick start

bash

agent-browser open <url>        # Navigate to page
agent-browser snapshot -i       # Get interactive elements with refs
agent-browser click @e1         # Click element by ref
agent-browser fill @e2 "text"   # Fill input by ref
agent-browser close             # Close browser

Core workflow

Navigate: agent-browser open <url>
Snapshot: agent-browser snapshot -i (returns elements with refs like @e1, @e2)
Interact using refs from the snapshot
Re-snapshot after navigation or significant DOM changes

Commands

Navigation

bash

agent-browser open <url>      # Navigate to URL
agent-browser back            # Go back
agent-browser forward         # Go forward
agent-browser reload          # Reload page
agent-browser close           # Close browser

Snapshot (page analysis)

bash

agent-browser snapshot            # Full accessibility tree
agent-browser snapshot -i         # Interactive elements only (recommended)
agent-browser snapshot -c         # Compact output
agent-browser snapshot -d 3       # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector

Interactions (use @refs from snapshot)

bash

agent-browser click @e1           # Click
agent-browser dblclick @e1        # Double-click
agent-browser fill @e2 "text"     # Clear and type
agent-browser type @e2 "text"     # Type without clearing
agent-browser press Enter         # Press key
agent-browser hover @e1           # Hover
agent-browser check @e1           # Check checkbox
agent-browser uncheck @e1         # Uncheck checkbox
agent-browser select @e1 "value"  # Select dropdown option
agent-browser scroll down 500     # Scroll page
agent-browser upload @e1 file.pdf # Upload files

Get information

bash

agent-browser get text @e1        # Get element text
agent-browser get html @e1        # Get innerHTML
agent-browser get value @e1       # Get input value
agent-browser get attr @e1 href   # Get attribute
agent-browser get title           # Get page title
agent-browser get url             # Get current URL
agent-browser get count ".item"   # Count matching elements

Screenshots & PDF

bash

agent-browser screenshot          # Save to temp directory
agent-browser screenshot path.png # Save to specific path
agent-browser screenshot --full   # Full page
agent-browser pdf output.pdf      # Save as PDF

Wait

bash

agent-browser wait @e1                     # Wait for element
agent-browser wait 2000                    # Wait milliseconds
agent-browser wait --text "Success"        # Wait for text
agent-browser wait --url "**/dashboard"    # Wait for URL pattern
agent-browser wait --load networkidle      # Wait for network idle

Semantic locators (alternative to refs)

bash

agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find placeholder "Search" type "query"

Authentication with saved state

bash

# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Later: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard

Cookies & Storage

bash

agent-browser cookies                     # Get all cookies
agent-browser cookies set name value      # Set cookie
agent-browser cookies clear               # Clear cookies
agent-browser storage local               # Get localStorage
agent-browser storage local set k v       # Set value

JavaScript

bash

agent-browser eval "document.title"   # Run JavaScript

Example: Form submission

bash

agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result

Example: Data extraction

bash

agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e1  # Get product title
agent-browser get attr @e2 href  # Get link URL
agent-browser screenshot products.png

Maintainer

qwibitai Core maintainer

Source details

Full Name: qwibitai/nanoclaw
Branch: main
Path in repo: container/skills/agent-browser
License: MIT License
Topics: claude-code ai-agents openclaw claude-skills ai-assistant

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

qwibitai/nanoclaw

capabilities

Show what this NanoClaw instance can do — installed skills, available tools, and system info. Read-only. Use when the user asks what the bot can do, what's installed, or runs /capabilities.

27,176 11,781

Explore

qwibitai/nanoclaw

status

Quick read-only health check — session context, workspace mounts, tool availability, and task snapshot. Use when the user asks for system status or runs /status.

27,176 11,781

Explore

qwibitai/nanoclaw

slack-formatting

Format messages for Slack using mrkdwn syntax. Use when responding to Slack channels (folder starts with "slack_" or JID contains slack identifiers).

27,176 11,781

Explore

qwibitai/nanoclaw

add-voice-transcription

Add voice message transcription to NanoClaw using OpenAI's Whisper API. Automatically transcribes WhatsApp voice notes so the agent can read and respond to them.

27,176 11,781

Explore

qwibitai/nanoclaw

add-whatsapp

Add WhatsApp as a channel. Can replace other channels entirely or run alongside them. Uses QR code or pairing code for authentication.

27,176 11,781

Explore

qwibitai/nanoclaw

add-reactions

Add WhatsApp emoji reaction support — receive, send, store, and search reactions.

27,176 11,781

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

Browser Automation with agent-browser

Quick start

Core workflow

Commands

Navigation

Snapshot (page analysis)

Interactions (use @refs from snapshot)

Get information

Screenshots & PDF

Wait

Semantic locators (alternative to refs)

Authentication with saved state

Cookies & Storage

JavaScript

Example: Form submission

Example: Data extraction

Recommended Agent Skills

capabilities

status

slack-formatting

add-voice-transcription

add-whatsapp

add-reactions