Agent skill
browser-automation
Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.
Install this agent skill to your Project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/emillindfors/browser-automation
SKILL.md
Browser Automation Skill
This skill provides guidance for using the rust-browser-mcp server to automate web browsers through the WebDriver protocol. It enables enterprise-grade browser control with performance monitoring, multi-session support, and health management.
Overview
The rust-browser-mcp server provides 45+ MCP tools for browser automation:
Core Automation Tools (25)
- Navigation: `navigate`, `back`, `forward`, `refresh`
- Element Interaction: `click`, `send_keys`, `hover`, `find_element`, `find_elements`
- Information Extraction: `get_title`, `get_text`, `get_attribute`, `get_property`, `get_page_source`
- Advanced: `fill_and_submit_form`, `login_form`, `scroll_to_element`, `wait_for_element`
- JavaScript: `execute_script`
- Visual: `screenshot`, `resize_window`, `get_current_url`, `get_page_load_status`
Performance Monitoring Tools (5)
- `get_performance_metrics` - Page load times, resource timing, navigation data
- `monitor_memory_usage` - Heap monitoring, memory leak detection
- `get_console_logs` - Error detection, log filtering
- `run_performance_test` - Automated performance analysis
- `monitor_resource_usage` - Network, FPS, CPU tracking
Driver Management Tools (7)
- `start_driver`, `stop_driver`, `stop_all_drivers`
- `list_managed_drivers`
- `get_healthy_endpoints`, `refresh_driver_health`
- `force_cleanup_orphaned_processes`
Recipe System (4)
- `create_recipe` - Create reusable automation workflows
- `execute_recipe` - Run a saved recipe
- `list_recipes` - List all available recipes
- `delete_recipe` - Remove a recipe
Setup Instructions
Prerequisites
Ensure you have at least one WebDriver installed:
- Chrome: ChromeDriver (must match Chrome version)
- Firefox: GeckoDriver
- Edge: MSEdgeDriver
Configuration for Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "browser": {
      "command": "/path/to/rust-browser-mcp",
      "args": ["--transport", "stdio", "--browser", "chrome"]
    }
  }
}
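If you would rather script this change than edit the file by hand, the following is a minimal sketch that merges the entry above into an existing configuration dictionary. The helper name `add_browser_server` is hypothetical, and the binary path remains a placeholder exactly as in the snippet above.

```python
import json


def add_browser_server(config: dict, binary_path: str, browser: str = "chrome") -> dict:
    """Return a copy of `config` with the "browser" MCP server entry added."""
    merged = dict(config)
    # Copy the server map so the caller's dictionary is not modified.
    servers = dict(merged.get("mcpServers", {}))
    servers["browser"] = {
        "command": binary_path,
        "args": ["--transport", "stdio", "--browser", browser],
    }
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already registers another (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_browser_server(existing, "/path/to/rust-browser-mcp")
print(json.dumps(updated, indent=2))
```

The sketch only builds the dictionary. Reading and writing `claude_desktop_config.json` is left out because the file's location depends on your platform.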
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `WEBDRIVER_ENDPOINT` | `auto` | WebDriver URL or "auto" for auto-discovery |
| `WEBDRIVER_HEADLESS` | `true` | Run browsers in headless mode |
| `WEBDRIVER_PREFERRED_DRIVER` | - | Preferred browser: chrome, firefox, edge |
| `WEBDRIVER_CONCURRENT_DRIVERS` | `firefox,chrome` | Browsers to start concurrently |
| `WEBDRIVER_POOL_ENABLED` | `true` | Enable connection pooling |
| `WEBDRIVER_POOL_MAX_CONNECTIONS` | `3` | Max connections per driver type |
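As a pre-flight check, a launcher script could resolve these variables the same way the table describes them. The sketch below mirrors the documented names and defaults only. The function name `load_webdriver_settings` and the parsing rules (for example, treating only the string "true" as enabled) are assumptions, not the server's actual behavior.

```python
import os


def load_webdriver_settings(env=None) -> dict:
    """Resolve the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env

    def as_bool(name: str, default: str) -> bool:
        # Assumption: only the literal string "true" (any case) enables a flag.
        return env.get(name, default).strip().lower() == "true"

    concurrent = env.get("WEBDRIVER_CONCURRENT_DRIVERS", "firefox,chrome")
    return {
        "endpoint": env.get("WEBDRIVER_ENDPOINT", "auto"),
        "headless": as_bool("WEBDRIVER_HEADLESS", "true"),
        # No default is documented for the preferred driver, so None means "unset".
        "preferred_driver": env.get("WEBDRIVER_PREFERRED_DRIVER"),
        "concurrent_drivers": [b.strip() for b in concurrent.split(",") if b.strip()],
        "pool_enabled": as_bool("WEBDRIVER_POOL_ENABLED", "true"),
        "pool_max_connections": int(env.get("WEBDRIVER_POOL_MAX_CONNECTIONS", "3")),
    }


# Example with one override; every other value comes from the documented defaults.
settings = load_webdriver_settings({"WEBDRIVER_HEADLESS": "false"})
print(settings)
```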
Usage Patterns
Basic Navigation
1. Use `navigate` with URL to load a page
2. Use `wait_for_element` to confirm the page has finished loading
3. Use `get_title` or `get_text` to verify content
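An MCP client normally issues these calls for you, but it can help to see what the three steps look like on the wire. The sketch below builds MCP `tools/call` requests (JSON-RPC 2.0) for the sequence above. The tool names come from this document, while the argument keys `url` and `selector` are assumptions; check `reference/tools.md` for the real parameter names.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument keys below are hypothetical placeholders.
requests = [
    tool_call("navigate", {"url": "https://example.com"}),
    tool_call("wait_for_element", {"selector": "h1"}),
    tool_call("get_title", {}),
]

for request in requests:
    print(json.dumps(request))
```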
Form Filling
1. Navigate to the form page
2. Use `find_element` with CSS selector to locate fields
3. Use `send_keys` to input values
4. Use `click` on submit button, or use `fill_and_submit_form` for convenience
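The manual variant of this pattern expands into a predictable tool sequence, sketched below as a list of `(tool, arguments)` pairs generated from a mapping of selectors to values. The helper name `form_fill_steps`, the argument keys (`url`, `selector`, `text`), and the example selectors are all hypothetical.

```python
def form_fill_steps(form_url: str, fields: dict, submit_selector: str) -> list:
    """Expand a selector-to-value mapping into an ordered list of tool calls."""
    steps = [("navigate", {"url": form_url})]
    for selector, value in fields.items():
        # Locate each field first, then type into it.
        steps.append(("find_element", {"selector": selector}))
        steps.append(("send_keys", {"selector": selector, "text": value}))
    steps.append(("click", {"selector": submit_selector}))
    return steps


steps = form_fill_steps(
    "https://example.com/contact",
    {"#name": "Ada", "#email": "ada@example.com"},
    "button[type=submit]",
)
for tool, arguments in steps:
    print(tool, arguments)
```

When the form is simple, `fill_and_submit_form` is likely to replace this whole sequence with a single call.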
Web Scraping
1. Navigate to target page
2. Use `find_elements` to get multiple matching elements
3. Use `get_text` or `get_attribute` to extract data
4. Use `execute_script` for complex DOM traversal
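For the last step, one option is to have the page collect the data itself and return it in a single call. The sketch below generates such a script, assuming `execute_script` forwards it to WebDriver's Execute Script command, where the script is a function body and `return` hands a value back. The argument key `script` and the selector `a.result` are hypothetical.

```python
import json


def link_extraction_script(selector: str) -> str:
    """Build a JavaScript function body that returns text and href for each match."""
    # json.dumps produces a quoted, escaped string that is also a valid
    # JavaScript string literal, so the selector cannot break out of the call.
    quoted = json.dumps(selector)
    return (
        "return Array.from(document.querySelectorAll(" + quoted + "))"
        ".map(el => ({text: el.textContent.trim(), href: el.getAttribute('href')}));"
    )


script = link_extraction_script("a.result")
call = {"name": "execute_script", "arguments": {"script": script}}
print(call["arguments"]["script"])
```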
Performance Testing
1. Navigate to page under test
2. Use `run_performance_test` for automated analysis
3. Use `get_performance_metrics` for detailed timing data
4. Use `monitor_memory_usage` to detect leaks
5. Use `get_console_logs` to capture errors
Multi-Step Workflows with Recipes
1. Define a recipe with `create_recipe` including steps array
2. Each step specifies: action (tool name), arguments, optional retry logic
3. Execute with `execute_recipe` and parameters
4. Recipes support conditions and browser-specific variants
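To make the shape of a recipe concrete, here is a sketch of a definition and a small local check to run before passing it to `create_recipe`. The field names (`name`, `steps`, `action`, `arguments`, `retry`, `attempts`) are assumptions inferred from the description above; `reference/recipes.md` is the authority on the real schema.

```python
def validate_recipe(recipe: dict) -> list:
    """Return a list of problems found in a recipe; an empty list means none."""
    problems = []
    if not recipe.get("name"):
        problems.append("recipe has no name")
    steps = recipe.get("steps")
    if not isinstance(steps, list) or not steps:
        problems.append("recipe has no steps")
        return problems
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"step {index} is not an object")
            continue
        if not isinstance(step.get("action"), str):
            problems.append(f"step {index} has no action")
        if not isinstance(step.get("arguments", {}), dict):
            problems.append(f"step {index} arguments must be an object")
    return problems


# Hypothetical recipe: each step names a tool, its arguments, and optional retry settings.
recipe = {
    "name": "check_homepage_title",
    "steps": [
        {"action": "navigate", "arguments": {"url": "https://example.com"}},
        {"action": "wait_for_element", "arguments": {"selector": "h1"}, "retry": {"attempts": 3}},
        {"action": "get_title", "arguments": {}},
    ],
}
print(validate_recipe(recipe))
```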
Session Management
Browser-Specific Sessions
Use session IDs prefixed with browser name for explicit browser control:
- `chrome_session1` - Uses Chrome
- `firefox_work` - Uses Firefox
- `edge_testing` - Uses Edge
Multi-Session Support
You can run multiple browser sessions concurrently by using different session IDs:
Session: chrome_user1 -> Opens first Chrome tab
Session: chrome_user2 -> Opens second Chrome tab
Session: firefox_admin -> Opens Firefox for different workflow
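The naming convention can be checked on the client side before a session is created. The sketch below assumes the browser is taken from the text before the first underscore. How the server actually matches prefixes, and what it does with unprefixed IDs, is not stated here, so the helper returns `None` for anything it does not recognize.

```python
KNOWN_BROWSERS = ("chrome", "firefox", "edge")


def browser_for_session(session_id: str):
    """Return the browser implied by a session ID prefix, or None if there is none."""
    # Assumption: the prefix is everything before the first underscore.
    prefix = session_id.split("_", 1)[0].lower()
    return prefix if prefix in KNOWN_BROWSERS else None


for session_id in ("chrome_user1", "chrome_user2", "firefox_admin", "session42"):
    print(session_id, "->", browser_for_session(session_id))
```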
Best Practices
Error Handling
- Always use `wait_for_element` before interacting with dynamic content
- Check `get_page_load_status` for slow-loading pages
- Use `get_console_logs` to debug JavaScript errors
Performance
- Enable connection pooling (default) for better resource usage
- Reuse session IDs when possible
- Use headless mode for faster execution
Security
- Never store credentials in recipes
- Use environment variables for sensitive data
- Clear sessions after authentication workflows
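One way to follow the first two points is to resolve credentials from the environment at call time and pass them as tool arguments, so they never appear in a saved recipe. In the sketch below, the variable names `APP_USERNAME` and `APP_PASSWORD` and the argument keys `username` and `password` are hypothetical; the real `login_form` parameters may differ.

```python
import os


def login_arguments(env=None) -> dict:
    """Read credentials from the environment, failing loudly if any are missing."""
    env = os.environ if env is None else env
    required = ("APP_USERNAME", "APP_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"username": env["APP_USERNAME"], "password": env["APP_PASSWORD"]}


# Demo with an explicit mapping; in real use, omit the argument to read os.environ.
arguments = login_arguments({"APP_USERNAME": "demo", "APP_PASSWORD": "not-a-real-secret"})
print(sorted(arguments))
```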
Troubleshooting
Driver Not Starting
- Verify WebDriver is installed and in PATH
- Check that the browser version matches the driver version
- Use `list_managed_drivers` to see driver status
Element Not Found
- Use browser DevTools to verify selector
- Wait for page load with `wait_for_element`
- Try different selector strategies (CSS, XPath)
Performance Issues
- Check `monitor_memory_usage` for leaks
- Use `get_console_logs` for JavaScript errors
- Consider reducing concurrent sessions
Reference Files
See companion files for detailed information:
- `reference/tools.md` - Complete tool documentation
- `reference/recipes.md` - Recipe system guide
- `examples/` - Example automation scripts