browser-automation

Guidance for effective browser automation with the dev-browser plugin. Use for testing local development, verifying UI changes, debugging visual issues, and automating browser tasks.

Install this agent skill in your project:

npx add-skill https://github.com/ken-cavanagh-glean/fieldkit/tree/main/plugins/browser-automation/skills/browser-automation

SKILL.md

Browser Automation Skill

Guidance for effective browser automation in Claude Code. Complements the dev-browser plugin.

Prerequisites

This skill provides guidance only. It requires the dev-browser plugin, which can be installed from within Claude Code:

/plugin marketplace add sawyerhood/dev-browser
/plugin install dev-browser@sawyerhood/dev-browser

When to Use Browser Automation

Good use cases:

Testing local development (localhost, staging)
Verifying UI changes after code modifications
Debugging visual issues or user flows
Extracting data from web pages
Automating repetitive browser tasks

Poor use cases:

Tasks that require authenticated sessions you can't access
High-frequency scraping (use APIs instead)
Actions on production systems without explicit approval

Core Patterns

1. Persistent Page Sessions

Dev-browser maintains page state across interactions. Use this for multi-step workflows:

1. Navigate once to the page
2. Inspect → identify elements
3. Interact → click, type, verify
4. Don't reload unless necessary
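
The "navigate once" rule can be sketched as a small wrapper that skips navigation when the session is already on the target page. This is a minimal illustration, not dev-browser's API: `PageSession` and `ensure_on` are hypothetical names, and the `navigate` callable stands in for browser_navigate.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PageSession:
    """Remembers the current URL so later steps reuse the loaded page."""

    navigate: Callable[[str], None]  # hypothetical stand-in for browser_navigate
    current_url: Optional[str] = None
    navigations: int = 0

    def ensure_on(self, url: str) -> bool:
        """Navigate only if not already on `url`; return True if navigated."""
        if self.current_url == url:
            return False
        self.navigate(url)
        self.current_url = url
        self.navigations += 1
        return True


# Demo with a recording stub instead of a real browser.
visited: List[str] = []
session = PageSession(navigate=visited.append)
session.ensure_on("http://localhost:3000/login")  # navigates
session.ensure_on("http://localhost:3000/login")  # reuses the loaded page
```

After both calls, `visited` holds a single entry, so form state typed into the page between the two calls would not be lost to a reload.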

2. LLM-Friendly DOM Inspection

Use DOM snapshots over screenshots when possible:

Snapshots are structured and searchable
Screenshots require visual interpretation
Combine both for complex debugging

Pattern:

snapshot → identify element refs → interact with refs
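
To show why a structured snapshot is easier to search than a screenshot, here is a rough sketch that indexes element refs from snapshot text. The line shape `role "label" [ref=id]` is an assumption made for illustration; the real snapshot format may differ, and `index_refs` and `find_ref` are hypothetical helpers.

```python
import re
from typing import Dict, Optional

# Assumed (hypothetical) line shape: <role> "<label>" [ref=<id>]
REF_LINE = re.compile(
    r'(?P<role>\w+)\s+"(?P<label>[^"]*)"\s+\[ref=(?P<ref>[^\]]+)\]'
)


def index_refs(snapshot: str) -> Dict[str, Dict[str, str]]:
    """Map each element ref to its role and label."""
    return {
        m.group("ref"): {"role": m.group("role"), "label": m.group("label")}
        for m in REF_LINE.finditer(snapshot)
    }


def find_ref(snapshot: str, role: str, label: str) -> Optional[str]:
    """Return the first ref whose role matches and whose label contains `label`."""
    for ref, info in index_refs(snapshot).items():
        if info["role"] == role and label.lower() in info["label"].lower():
            return ref
    return None


SAMPLE_SNAPSHOT = '''
- heading "Sign in" [ref=h-1]
- textbox "Email" [ref=input-1]
- textbox "Password" [ref=input-2]
- button "Submit" [ref=btn-42]
'''

submit_ref = find_ref(SAMPLE_SNAPSHOT, "button", "submit")
```

The resulting ref is what would then be passed to an interaction tool such as browser_click.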

3. Step-by-Step for Exploration

When exploring unknown pages:

1. Take snapshot to understand structure
2. Identify interactive elements
3. Take one action
4. Verify result with new snapshot
5. Repeat
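
The five steps above can be sketched as a loop that takes exactly one action per iteration and records whether the page changed. All callables here are hypothetical stand-ins for the browser tools, and the fake page at the bottom exists only to make the sketch runnable.

```python
from typing import Callable, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, element ref), e.g. ("browser_click", "nav-1")


def explore(
    snapshot: Callable[[], str],
    choose: Callable[[str], Optional[Action]],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Tuple[Action, bool]]:
    """Take one action per iteration and log whether the snapshot changed."""
    log: List[Tuple[Action, bool]] = []
    before = snapshot()                  # 1. understand structure
    for _ in range(max_steps):
        action = choose(before)          # 2. identify an interactive element
        if action is None:
            break                        # nothing left to try
        act(action)                      # 3. take one action
        after = snapshot()               # 4. verify with a new snapshot
        log.append((action, after != before))
        before = after                   # 5. repeat from the new state
    return log


# Fake two-page site used only for the demo.
pages = ['link "About" [ref=nav-1]', 'heading "About us" [ref=h-1]']
state = {"index": 0}


def fake_snapshot() -> str:
    return pages[state["index"]]


def fake_choose(snap: str) -> Optional[Action]:
    return ("browser_click", "nav-1") if "nav-1" in snap else None


def fake_act(action: Action) -> None:
    state["index"] = 1  # clicking the link moves to the second page


history = explore(fake_snapshot, fake_choose, fake_act)
```

The `max_steps` cap is a design choice for the sketch: it keeps an exploration from looping forever on a page where every action leaves something new to click.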

4. Full Scripts for Known Flows

When you know the exact flow:

1. Write complete interaction sequence
2. Execute in one script
3. Verify final state
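
As an illustration, a known flow can be written as a list of steps that runs in one pass and is verified once at the end. `run_script` and `call_tool` are hypothetical; the argument names `url`, `ref`, and `text` are assumptions about how the tools are invoked.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]  # (tool name, arguments)


def run_script(
    steps: List[Step],
    call_tool: Callable[[str, Dict[str, Any]], Any],
    verify: Callable[[], bool],
) -> Dict[str, Any]:
    """Run every step in order, then verify the final state once."""
    for index, (tool, args) in enumerate(steps, start=1):
        try:
            call_tool(tool, args)
        except Exception as exc:  # stop at the first failing step
            return {"ok": False, "failed_step": index, "error": str(exc)}
    return {"ok": verify(), "failed_step": None, "error": None}


# Demo with a recording stub instead of real browser tools.
calls: List[str] = []


def recording_call_tool(tool: str, args: Dict[str, Any]) -> None:
    calls.append(tool)


LOGIN_STEPS: List[Step] = [
    ("browser_navigate", {"url": "http://localhost:3000/login"}),
    ("browser_click", {"ref": "submit-btn"}),
    ("browser_wait_for", {"text": "Dashboard"}),
]

result = run_script(LOGIN_STEPS, recording_call_tool, verify=lambda: True)
```

Reporting the index of the failing step makes it easier to fall back to step-by-step exploration from that point.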

Common Operations

Navigation

browser_navigate - Go to URL
browser_navigate_back - Go back
browser_snapshot - Get page structure (preferred)
browser_take_screenshot - Visual capture

Interaction

browser_click - Click element by ref
browser_type - Type into element
browser_fill_form - Fill multiple fields
browser_select_option - Select from dropdown
browser_press_key - Keyboard input

Waiting

browser_wait_for - Wait for text/element/time
Always wait after navigation or actions that trigger loading

Debugging

browser_console_messages - Check for errors
browser_network_requests - Inspect API calls
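
A possible way to triage the output of these two tools is to keep only error-level console messages and failed requests. The record shapes below (`type`/`text` for console messages, `method`/`url`/`status` for requests) are assumptions for illustration, not the tools' documented output.

```python
from typing import Dict, List


def console_errors(messages: List[Dict[str, str]]) -> List[str]:
    """Keep only the text of error-level console messages."""
    return [m["text"] for m in messages if m.get("type") == "error"]


def failed_requests(requests: List[Dict[str, object]]) -> List[str]:
    """Summarize requests whose HTTP status is 400 or higher."""
    return [
        f'{r["method"]} {r["url"]} -> {r["status"]}'
        for r in requests
        if int(r["status"]) >= 400
    ]


# Hypothetical sample data.
sample_messages = [
    {"type": "log", "text": "app started"},
    {"type": "error", "text": "TypeError: user is undefined"},
]
sample_requests = [
    {"method": "GET", "url": "/api/me", "status": 200},
    {"method": "POST", "url": "/api/login", "status": 401},
]

errors = console_errors(sample_messages)
failures = failed_requests(sample_requests)
```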

Best Practices

1. Reference-Based Interaction

Always use element refs from snapshots, not CSS selectors:

snapshot → find ref="btn-42" → click ref="btn-42"

2. Explicit Waits

After actions that cause page changes:

click → wait_for text="Success" → continue
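
Conceptually, an explicit wait polls the page until the expected text appears or a timeout expires. The sketch below illustrates that behavior only; `wait_for_text` is a hypothetical helper, not how browser_wait_for is implemented.

```python
import time
from typing import Callable


def wait_for_text(
    get_page_text: Callable[[], str],
    text: str,
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    """Poll until `text` appears or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if text in get_page_text():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Fake page that finishes loading on the third read.
responses = iter(["Loading...", "Loading...", "Success"])
appeared = wait_for_text(lambda: next(responses), "Success", timeout=2.0, interval=0.01)
```

Waiting on a specific outcome such as "Success" is generally more reliable than a fixed delay, which is either too short on a slow page or wasted time on a fast one.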

3. Error Recovery

If an action fails:

Take new snapshot
Verify page state
Adjust approach
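
One way to sketch this recovery loop is to take a fresh snapshot and re-locate the target before every attempt, so a retry never reuses a ref from an outdated snapshot. The helper names, the `RuntimeError` failure type, and the `[ref=...]` snapshot shape are all assumptions for illustration.

```python
from typing import Callable, List, Optional


def act_with_recovery(
    snapshot: Callable[[], str],
    locate: Callable[[str], Optional[str]],
    act: Callable[[str], None],
    attempts: int = 3,
) -> bool:
    """Re-snapshot and re-locate the target before each attempt."""
    for _ in range(attempts):
        ref = locate(snapshot())   # take a new snapshot, verify page state
        if ref is None:
            continue               # target not present yet; look again
        try:
            act(ref)
            return True
        except RuntimeError:
            continue               # adjust: the next pass picks up fresh refs
    return False


# Demo: the first snapshot yields a ref that no longer works.
snapshots = iter(['button "Save" [ref=btn-1]', 'button "Save" [ref=btn-2]'])
clicked: List[str] = []


def fake_locate(snap: str) -> Optional[str]:
    return snap.split("[ref=")[1].rstrip("]") if "[ref=" in snap else None


def fake_act(ref: str) -> None:
    if ref == "btn-1":
        raise RuntimeError("element not found")
    clicked.append(ref)


recovered = act_with_recovery(lambda: next(snapshots), fake_locate, fake_act)
```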

4. Form Filling

Use browser_fill_form for multiple fields:

fill_form([
  {name: "email", type: "textbox", ref: "...", value: "..."},
  {name: "password", type: "textbox", ref: "...", value: "..."}
])
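
The field list can be assembled from refs found in a snapshot rather than typed by hand. This minimal sketch uses the field keys shown above (`name`, `type`, `ref`, `value`); `build_fields` is a hypothetical helper, and the refs are placeholder values.

```python
from typing import Dict, List


def build_fields(
    values: Dict[str, str],
    refs: Dict[str, str],
    field_type: str = "textbox",
) -> List[Dict[str, str]]:
    """Pair each value with its snapshot ref; fail early if a ref is missing."""
    missing = [name for name in values if name not in refs]
    if missing:
        raise KeyError(f"no snapshot ref for fields: {missing}")
    return [
        {"name": name, "type": field_type, "ref": refs[name], "value": value}
        for name, value in values.items()
    ]


# Placeholder refs, as they might be read from a snapshot.
refs = {"email": "input-1", "password": "input-2"}
values = {"email": "test@example.com", "password": "testpass"}
payload = build_fields(values, refs)
```

Failing before any field is filled avoids leaving the form half-completed when a ref is missing.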

5. Verification Pattern

After completing a flow:

1. Take final snapshot or screenshot
2. Verify expected elements present
3. Check console for errors
4. Report success/failure with evidence
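
The verification steps can be sketched as a function that collects evidence while checking, so the final report shows what was found rather than only a verdict. `FlowReport` and `verify_flow` are hypothetical names, and the snapshot is treated as plain text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlowReport:
    passed: bool
    evidence: List[str] = field(default_factory=list)


def verify_flow(
    snapshot: str,
    expected: List[str],
    console_errors: List[str],
) -> FlowReport:
    """Check expected text and console errors, recording evidence for each."""
    evidence: List[str] = []
    passed = True
    for text in expected:
        found = text in snapshot
        evidence.append(f"{'found' if found else 'missing'}: {text}")
        passed = passed and found
    if console_errors:
        passed = False  # any console error fails the flow in this sketch
        evidence.extend(f"console error: {err}" for err in console_errors)
    return FlowReport(passed, evidence)


report = verify_flow('heading "Dashboard" [ref=h-1]', ["Dashboard"], [])
```

Treating any console error as a failure is a strict choice made for the sketch; a real flow might tolerate known, harmless warnings.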

Integration with Glean Workflows

Testing Agent-Generated Content

Build agent in Glean
Navigate to Glean in browser
Test agent responses
Verify output format and accuracy

Verifying Customer Deployments

Navigate to customer's Glean instance (if accessible)
Test specific agent or search functionality
Document results with screenshots

Local Development Testing

Start local dev server
Navigate to localhost
Test changes iteratively
Verify before committing

Example Workflow

Testing a login flow:

1. browser_navigate("http://localhost:3000/login")
2. browser_snapshot() → identify form elements
3. browser_fill_form([
     {name: "email", type: "textbox", ref: "input-1", value: "test@example.com"},
     {name: "password", type: "textbox", ref: "input-2", value: "testpass"}
   ])
4. browser_click(ref: "submit-btn")
5. browser_wait_for(text: "Dashboard")
6. browser_snapshot() → verify logged in state
7. Report: "Login successful - dashboard loaded"

Skill version: 1.0.0
Requires: dev-browser plugin
-- Axon | 2026-01-01

Maintainer

ken-cavanagh-glean Core maintainer

Source details

Full Name: ken-cavanagh-glean/fieldkit
Branch: main
Path in repo: plugins/browser-automation/skills/browser-automation
