Agent skill

remote-browser

Controls a local browser from a sandboxed remote machine. Use when the agent is running in a sandbox (no GUI) and needs to navigate websites, interact with web pages, fill forms, take screenshots, or expose local dev servers via tunnels.

Install this agent skill in your project

npx add-skill https://github.com/browser-use/browser-use/tree/main/skills/remote-browser

SKILL.md

Browser Automation for Sandboxed Agents

This skill is for agents running on sandboxed remote machines (cloud VMs, CI, coding agents) that need to control a headless browser.

Prerequisites

bash

browser-use doctor    # Verify installation

For setup details, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md

Core Workflow

Navigate: browser-use open <url> — starts headless browser if needed
Inspect: browser-use state — returns clickable elements with indices
Interact: use indices from state (browser-use click 5, browser-use input 3 "text")
Verify: browser-use state or browser-use screenshot to confirm
Repeat: browser stays open between commands
Cleanup: browser-use close when done
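
The steps above can be sketched as two small shell functions, split at the point where the agent has to read the `state` output to pick indices. This is a minimal sketch, not part of the skill: the function names `inspect_page` and `fill_and_submit` and the `BROWSER_USE` override variable are hypothetical, and the indices passed in must come from a real `state` listing.

```shell
# Hypothetical override so the sketch can be dry-run (e.g. BROWSER_USE=echo);
# it is not a browser-use feature. Defaults to the real CLI.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# inspect_page URL
# Navigate, then list the clickable elements with their indices.
inspect_page() {
  "$BROWSER_USE" open "$1" || return 1
  "$BROWSER_USE" state
}

# fill_and_submit INPUT_INDEX TEXT SUBMIT_INDEX
# Interact using indices read from the state output, then verify with state.
fill_and_submit() {
  "$BROWSER_USE" input "$1" "$2" || return 1
  "$BROWSER_USE" click "$3" || return 1
  "$BROWSER_USE" state
}
```

A run might call `inspect_page https://example.com`, read the indices from the output, call `fill_and_submit 3 "text" 5`, and finish with `browser-use close`.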

Browser Modes

bash

browser-use open <url>                                    # Default: headless Chromium
browser-use cloud connect                                 # Provision cloud browser and connect
browser-use --connect open <url>                          # Auto-discover running Chrome via CDP
browser-use --cdp-url ws://localhost:9222/... open <url>  # Connect via CDP URL

Commands

bash

# Navigation
browser-use open <url>                    # Navigate to URL
browser-use back                          # Go back in history
browser-use scroll down                   # Scroll down (--amount N for pixels)
browser-use scroll up                     # Scroll up
browser-use switch <tab>                  # Switch to tab by index
browser-use close-tab [tab]               # Close tab (current if no index)

# Page State — always run state first to get element indices
browser-use state                         # URL, title, clickable elements with indices
browser-use screenshot [path.png]         # Screenshot (base64 if no path, --full for full page)

# Interactions — use indices from state
browser-use click <index>                 # Click element by index
browser-use click <x> <y>                 # Click at pixel coordinates
browser-use type "text"                   # Type into focused element
browser-use input <index> "text"          # Click element, then type
browser-use keys "Enter"                  # Send keyboard keys (also "Control+a", etc.)
browser-use select <index> "option"       # Select dropdown option
browser-use upload <index> <path>         # Upload file to file input
browser-use hover <index>                 # Hover over element
browser-use dblclick <index>              # Double-click element
browser-use rightclick <index>            # Right-click element

# Data Extraction
browser-use eval "js code"                # Execute JavaScript, return result
browser-use get title                     # Page title
browser-use get html [--selector "h1"]    # Page HTML (or scoped to selector)
browser-use get text <index>              # Element text content
browser-use get value <index>             # Input/textarea value
browser-use get attributes <index>        # Element attributes
browser-use get bbox <index>              # Bounding box (x, y, width, height)

# Wait
browser-use wait selector "css"           # Wait for element (--state visible|hidden|attached|detached, --timeout ms)
browser-use wait text "text"              # Wait for text to appear

# Cookies
browser-use cookies get [--url <url>]     # Get cookies (optionally filtered)
browser-use cookies set <name> <value>    # Set cookie (--domain, --secure, --http-only, --same-site, --expires)
browser-use cookies clear [--url <url>]   # Clear cookies
browser-use cookies export <file>         # Export to JSON
browser-use cookies import <file>         # Import from JSON

# Python — persistent session with browser access
browser-use python "code"                 # Execute Python (variables persist across calls)
browser-use python --file script.py       # Run file
browser-use python --vars                 # Show defined variables
browser-use python --reset                # Clear namespace

# Session
browser-use close                         # Close browser and stop daemon
browser-use sessions                      # List active sessions
browser-use close --all                   # Close all sessions

The Python browser object provides: browser.url, browser.title, browser.html, browser.goto(url), browser.back(), browser.click(index), browser.type(text), browser.input(index, text), browser.keys(keys), browser.upload(index, path), browser.screenshot(path), browser.scroll(direction, amount), browser.wait(seconds).
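
As an illustration of the persistent Python session, the sketch below stores a value in one call and reads it back in a later one, using only `browser.url`, `browser.goto(url)`, and `browser.title` from the list above. The function names and the `BROWSER_USE` override variable are hypothetical, and it is an assumption that `print` output is echoed back by the CLI.

```shell
# Hypothetical override so the sketch can be dry-run (e.g. BROWSER_USE=echo);
# it is not a browser-use feature. Defaults to the real CLI.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# remember_start_url
# Stores the current URL in a Python variable; per the skill, variables
# persist across `browser-use python` calls.
remember_start_url() {
  "$BROWSER_USE" python "start_url = browser.url"
}

# visit_and_report URL
# Navigates with browser.goto(), then prints the new title and the stored URL.
# URL is interpolated into Python source, so it must not contain quotes.
visit_and_report() {
  "$BROWSER_USE" python "browser.goto('$1')" || return 1
  "$BROWSER_USE" python "print(browser.title, start_url)"
}
```

Calling `remember_start_url` followed by `visit_and_report https://example.com` relies on `start_url` surviving between the separate invocations; `browser-use python --reset` would clear it.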

Tunnels

Expose local dev servers to the browser via Cloudflare tunnels.

bash

browser-use tunnel <port>                 # Start tunnel (idempotent)
browser-use tunnel list                   # Show active tunnels
browser-use tunnel stop <port>            # Stop tunnel
browser-use tunnel stop --all             # Stop all tunnels

Command Chaining

Commands can be chained with &&. The browser persists via the daemon, so chaining is safe and efficient.

bash

browser-use open https://example.com && browser-use state
browser-use input 5 "user@example.com" && browser-use input 6 "password" && browser-use click 7

Chain when you don't need intermediate output. Run separately when you need to parse state to discover indices first.
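
When indices have to be discovered first, one option is to run `state` on its own and filter its output before acting. The helper below is a rough sketch: `find_in_state` and the `BROWSER_USE` override are hypothetical names, and it assumes that `state` prints each element's label on the same line as its index, which this skill does not specify.

```shell
# Hypothetical override so the sketch can be dry-run (e.g. BROWSER_USE=echo);
# it is not a browser-use feature. Defaults to the real CLI.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# find_in_state LABEL
# Runs `state` as a separate step and keeps only the lines mentioning LABEL
# (case-insensitive), so the matching element index can be read off.
# Returns non-zero when nothing matches.
find_in_state() {
  "$BROWSER_USE" state | grep -i -- "$1"
}
```

For example, `find_in_state "sign in"` would narrow the listing to candidate elements; the chosen index can then go into a chained `input` and `click` sequence.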

Common Workflows

Exposing Local Dev Servers

bash

python -m http.server 3000 &                      # Start dev server
browser-use tunnel 3000                            # → https://abc.trycloudflare.com
browser-use open https://abc.trycloudflare.com     # Browse the tunnel

Tunnels are independent of browser sessions and persist across browser-use close.

Global Options

| Option | Description |
| --- | --- |
| `--headed` | Show browser window |
| `--connect` | Auto-discover running Chrome via CDP |
| `--cdp-url <url>` | Connect via CDP URL (`http://` or `ws://`) |
| `--session NAME` | Target a named session (default: "default") |
| `--json` | Output as JSON |
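
A short sketch of how `--session` and `--json` might be combined, placing global options before the subcommand as in the Browser Modes examples. The function names and the `BROWSER_USE` override variable are hypothetical, and the shape of the JSON output is not documented in this skill.

```shell
# Hypothetical override so the sketch can be dry-run (e.g. BROWSER_USE=echo);
# it is not a browser-use feature. Defaults to the real CLI.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# open_in_session NAME URL
# Opens URL in the named session instead of the "default" one.
open_in_session() {
  "$BROWSER_USE" --session "$1" open "$2"
}

# state_json NAME
# Prints the page state of the named session as JSON.
state_json() {
  "$BROWSER_USE" --session "$1" --json state
}
```

With this, `open_in_session docs https://example.com` and `open_in_session app https://example.org` would keep two browsers apart, and `browser-use sessions` should list both.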

Tips

Always run state first to see available elements and their indices
Sessions persist — browser stays open between commands until you close it
Tunnels are independent — they persist across browser-use close
tunnel is idempotent — calling again for the same port returns the existing URL

Troubleshooting

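Browser won't start? Run browser-use close, then retry. If it still fails, run browser-use doctor to check the installation.
Element not found? Run browser-use scroll down, then browser-use state again to list the elements now in view.
Tunnel not working? Run which cloudflared to confirm cloudflared is installed, and browser-use tunnel list to see active tunnels.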
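
The first recovery step can be scripted roughly as follows. This is a sketch under the assumption that `browser-use open` exits with a non-zero status when the browser fails to start; `open_with_retry` and the `BROWSER_USE` override variable are hypothetical names.

```shell
# Hypothetical override so the sketch can be dry-run (e.g. BROWSER_USE=echo);
# it is not a browser-use feature. Defaults to the real CLI.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# open_with_retry URL
# Tries to open URL; on failure closes the session and retries once.
# If the retry also fails, runs doctor for diagnostics and returns 1.
open_with_retry() {
  "$BROWSER_USE" open "$1" && return 0
  "$BROWSER_USE" close || true
  "$BROWSER_USE" open "$1" && return 0
  "$BROWSER_USE" doctor || true
  return 1
}
```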

Cleanup

bash

browser-use close                         # Close browser session
browser-use tunnel stop --all             # Stop tunnels (if any)
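
In a longer script, the two cleanup commands can be wrapped in a function so they run together. The sketch below is one way to do that; `cleanup` and the `BROWSER_USE` override variable are hypothetical names.

```shell
# Hypothetical override so the sketch can be dry-run (e.g. BROWSER_USE=echo);
# it is not a browser-use feature. Defaults to the real CLI.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# cleanup
# Closes the browser session, then stops all tunnels. Failures are ignored
# so that one failed step does not skip the other.
cleanup() {
  "$BROWSER_USE" close || true
  "$BROWSER_USE" tunnel stop --all || true
}
```

Registering it with `trap cleanup EXIT` near the top of a script makes the shell call it when the script exits, whether it finished normally or stopped on an error.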

Maintainer

browser-use Core maintainer

Source details

Full Name: browser-use/browser-use
Branch: main
Path in repo: skills/remote-browser
License: MIT License
Topics: ai-agents llm python playwright ai-tools browser-automation browser-use
