web-eval-agent

Autonomous browser-based agent for web app testing and debugging, integrated with MCP.

Stars: 1,175
Forks: 97
Watchers: 1,175
Issues: 10
web-eval-agent launches an MCP-compliant server that enables automated UX evaluation and debugging of web applications using a browser-driven agent. It autonomously navigates web apps, captures network and console logs, and generates detailed reports. The solution integrates with code editors such as Cursor, enabling seamless in-editor tool invocation and efficient end-to-end testing workflows. Features include browser state management, rich context reporting, and API key-based authentication.

Key Features

Automated browser interaction for app testing
Captures network traffic and console logs
Generates detailed UX evaluation reports
Supports autonomous debugging workflows
Reusable authenticated browser sessions
IDE chat integration for tool invocation
API key-based authentication and configuration
Customizable test tasks via natural language
Easy setup with one-click installer or manual installation
Real-time and chronological test reporting

Use Cases

Automated end-to-end testing of web applications
Debugging UX issues during development
Continuous monitoring of web app flows in CI/CD pipelines
Validating feature implementations post-deployment
Capturing and analyzing network requests for troubleshooting
Detecting and reporting console errors in real time
Performing authenticated flows without manual sign-in on each run
Seamless web app evaluation directly from code editors
Creating automated regression test scripts using natural language
Providing QA reports for development teams

README

🚀 operative.sh web-eval-agent MCP Server

Let the coding agent debug itself, you've got better things to do.

Demo

🔥 Supercharge Your Debugging

operative.sh's MCP Server launches a browser-use powered agent to autonomously execute and debug web apps directly in your code editor.

⚡ Features

  • 🌐 Navigate your webapp using BrowserUse (2x faster with operative backend)
  • 📊 Capture network traffic - requests are intelligently filtered and returned into the context window
  • 🚨 Collect console errors - captures logs & errors
  • 🤖 Autonomous debugging - the Cursor agent calls the web QA agent MCP server to test whether the code it wrote works as expected end-to-end.

🧰 MCP Tool Reference

Tool                 Purpose
web_eval_agent       🤖 Automated UX evaluator that drives the browser, captures screenshots, console & network logs, and returns a rich UX report.
setup_browser_state  🔒 Opens an interactive (non-headless) browser so you can sign in once; the saved cookies/local-storage are reused by subsequent web_eval_agent runs.

Key arguments

  • web_eval_agent

    • url (required) – address of the running app (e.g. http://localhost:3000)
    • task (required) – natural-language description of what to test ("run through the signup flow and note any UX issues")
    • headless_browser (optional, default false) – set to true to hide the browser window
  • setup_browser_state

    • url (optional) – page to open first (handy to land directly on a login screen)

You can trigger these tools straight from your IDE chat, for example:

bash
Evaluate my app at http://localhost:3000 – run web_eval_agent with the task "Try the full signup flow and report UX issues".
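
For clients that invoke the tool programmatically rather than through chat, the arguments listed above map onto a standard MCP `tools/call` request. The following is a minimal sketch of that payload, not the project's own client code; the helper names `build_tool_call` and `web_eval_agent_call` are hypothetical, and only the argument names and the `false` default come from the list above.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def web_eval_agent_call(url, task, headless_browser=False, request_id=1):
    """Hypothetical helper: assemble web_eval_agent arguments.

    url and task are required; headless_browser defaults to False,
    matching the argument list in this README.
    """
    if not url or not task:
        raise ValueError("url and task are both required")
    arguments = {"url": url, "task": task, "headless_browser": headless_browser}
    return build_tool_call("web_eval_agent", arguments, request_id)


if __name__ == "__main__":
    request = web_eval_agent_call(
        "http://localhost:3000",
        "Try the full signup flow and report UX issues",
    )
    print(json.dumps(request, indent=2))
```

In practice your editor's MCP client builds and sends this request for you; the sketch only shows which fields end up where.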

🏁 Quick Start

Easy Setup with One-Click Integration

  1. Get your API key (free) - when you create your API key, you'll see:
    • "Add to Cursor" button with a deeplink for instant Cursor installation
    • Prefilled Claude Code command with your API key automatically included

Manual Setup (macOS/Linux)

  1. Pre-requisites (typically not needed):
    • brew: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    • npm: brew install npm
    • jq: brew install jq
  2. Run the installer after getting an API key (free):
bash
curl -LSf https://operative.sh/install.sh -o install.sh && bash install.sh && rm install.sh
  3. Open your IDE and restart it to apply the changes
  4. Send a prompt in chat mode to call the web eval agent tool, e.g.:
bash
Test my app on http://localhost:3000. Use web-eval-agent.

🛠️ Manual Installation

  1. Get your API key at operative.sh/mcp
  2. Install uv
bash
curl -LsSf https://astral.sh/uv/install.sh | sh
  3. Source environment variables after installing uv

Mac

source ~/.zshrc

Linux

source ~/.bashrc
  4. Install playwright:
bash
npm install -g chromium playwright && uvx --with playwright playwright install --with-deps
  5. Add the JSON below to your code editor's MCP configuration, with your API key
  6. Restart your code editor
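
If the server does not show up after the restart, a common cause is that `uv` or `npm` never made it onto your PATH (for example because the shell profile was not re-sourced). The snippet below is a small preflight sketch, not part of the installer; `missing_commands` is a hypothetical helper, and the list of commands is an assumption based on the steps above.

```python
import shutil

# Commands the manual steps rely on: uv/uvx from the uv installer, npm for Playwright.
REQUIRED_COMMANDS = ("uv", "uvx", "npm")


def missing_commands(which=shutil.which):
    """Return the required commands that cannot be found on PATH.

    `which` is injectable so the check can be exercised without touching PATH.
    """
    return [cmd for cmd in REQUIRED_COMMANDS if which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required commands found.")
```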

🔃 Updating

  • uv cache clean
  • refresh MCP server
json
    "web-eval-agent": {
      "command": "uvx",
      "args": [
        "--refresh-package",
        "webEvalAgent",
        "--from",
        "git+https://github.com/Operative-Sh/web-eval-agent.git",
        "webEvalAgent"
      ],
      "env": {
        "OPERATIVE_API_KEY": "<YOUR_KEY>"
      }
    }
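
If you would rather script this than paste the entry by hand, the sketch below merges it into an existing config without disturbing other servers. It assumes the editor keeps its servers under a top-level `mcpServers` key (the layout Cursor and several other MCP clients use); check your editor's documentation for the actual file location and key. `web_eval_agent_entry`, `merge_server`, and `add_server` are hypothetical helper names.

```python
import json
from pathlib import Path


def web_eval_agent_entry(api_key):
    """Return the server entry shown above, with the given API key filled in."""
    return {
        "command": "uvx",
        "args": [
            "--refresh-package",
            "webEvalAgent",
            "--from",
            "git+https://github.com/Operative-Sh/web-eval-agent.git",
            "webEvalAgent",
        ],
        "env": {"OPERATIVE_API_KEY": api_key},
    }


def merge_server(config, api_key):
    """Add or overwrite the web-eval-agent entry, keeping other servers intact.

    Assumption: servers live under a top-level "mcpServers" key.
    """
    config.setdefault("mcpServers", {})["web-eval-agent"] = web_eval_agent_entry(api_key)
    return config


def add_server(config_path, api_key):
    """Read the JSON config at config_path (if it exists), merge, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    merge_server(config, api_key)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_server(Path.home() / ".cursor" / "mcp.json", "<YOUR_KEY>")` would target a Cursor-style config, assuming that is where your installation keeps it.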

Operative Discord Server

🛠️ Manual Installation (Mac + Cursor/Cline/Windsurf)

  1. Get your API key at operative.sh/mcp
  2. Install uv
bash
curl -LsSf https://astral.sh/uv/install.sh | sh
  3. Install playwright:
bash
npm install -g chromium playwright && uvx --with playwright playwright install --with-deps
  4. Add the JSON above to your code editor's MCP configuration, with your API key
  5. Restart your code editor

Manual Installation (Windows + Cursor/Cline/Windsurf)

We're still refining this; please open an issue if you run into problems!

  1. Do all this in your code editor terminal
  2. curl -LSf https://operative.sh/install.sh -o install.sh && bash install.sh && rm install.sh
  3. Get your API key at operative.sh/mcp
  4. Install uv (curl -LsSf https://astral.sh/uv/install.sh | sh)
  5. uvx --from git+https://github.com/Operative-Sh/web-eval-agent.git playwright install
  6. Restart code editor

🚨 Issues

  • If updates aren't being received in your code editor, run uv cache clean, then update or reinstall to get the latest version.
  • For anything else, feel free to open an issue on this repo or ask in the Discord!
  • 5/5 - static apps without changes weren't screencasting; fixed! Run uv cache clean and restart to get the fix.

Changelog

  • 4/29 - Agent overlay update - pause/play/stop agent run in the browser

📋 Example MCP Server Output Report

text
📊 Web Evaluation Report for http://localhost:5173 complete!
📝 Task: Test the API-key deletion flow by navigating to the API Keys section, deleting a key, and judging the UX.

🔍 Agent Steps
  📍 1. Navigate → http://localhost:5173
  📍 2. Click     "Login"        (button index 2)
  📍 3. Click     "API Keys"     (button index 4)
  📍 4. Click     "Create Key"   (button index 9)
  📍 5. Type      "Test API Key" (input index 2)
  📍 6. Click     "Done"         (button index 3)
  📍 7. Click     "Delete"       (button index 10)
  📍 8. Click     "Delete"       (confirm index 3)
🏁 Flow tested successfully – UX felt smooth and intuitive.

🖥️ Console Logs (10)
  1. [debug] [vite] connecting…
  2. [debug] [vite] connected.
  3. [info]  Download the React DevTools …
     …

🌐 Network Requests (10)
  1. GET /src/pages/SleepingMasks.tsx                   304
  2. GET /src/pages/MCPRegistryRegistry.tsx             304
     …

⏱️ Chronological Timeline
  01:16:23.293 🖥️ Console [debug] [vite] connecting…
  01:16:23.303 🖥️ Console [debug] [vite] connected.
  01:16:23.312 ➡️ GET /src/pages/SleepingMasks.tsx
  01:16:23.318 ⬅️ 304 /src/pages/SleepingMasks.tsx
     …
  01:17:45.038 🤖 🏁 Flow finished – deletion verified
  01:17:47.038 🤖 📋 Conclusion repeated above
👁️  See the "Operative Control Center" dashboard for live logs.
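
Because the timeline lines start with an `HH:MM:SS.mmm` timestamp, a report like the one above is easy to post-process. The sketch below estimates how long a run took from the first and last timestamped lines. It assumes the layout seen in this one example (the format is not documented as stable), it does not handle runs that cross midnight, and `timeline_duration_seconds` is a hypothetical helper.

```python
import re
from datetime import datetime

# Leading "HH:MM:SS.mmm" followed by whitespace, as in the example timeline.
TIMESTAMP = re.compile(r"^\s*(\d{2}:\d{2}:\d{2}\.\d{3})\s")


def timeline_duration_seconds(report_text):
    """Return seconds between the first and last timestamped lines (0.0 if fewer than two)."""
    stamps = []
    for line in report_text.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
    if len(stamps) < 2:
        return 0.0
    return (stamps[-1] - stamps[0]).total_seconds()


# Timeline lines copied from the example report, emoji markers omitted.
SAMPLE = """\
  01:16:23.293 Console [debug] [vite] connecting...
  01:16:23.303 Console [debug] [vite] connected.
  01:17:45.038 Flow finished - deletion verified
  01:17:47.038 Conclusion repeated above
"""

if __name__ == "__main__":
    print(timeline_duration_seconds(SAMPLE))  # 83.745
```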

Star History

Star History Chart


Built with <3 @ operative.sh

Repository Owner

Operative-Sh (Organization)

Repository Details

Language: Python
Default Branch: main
Size: 118,165 KB
Contributors: 4
License: Apache License 2.0
MCP Verified: Sep 5, 2025

Programming Languages

Python: 68.38%
HTML: 19.31%
JavaScript: 12.31%

Tags

Topics

debugging debugging-tool mcp mcp-server modelcontextprotocol playwright qa vibe-coding vibe-testing
