MCP Server for the RAG Web Browser Actor

Local MCP server enabling LLMs to browse and extract web content via RAG Web Browser integration.

Stars: 194, Forks: 25, Watchers: 194, Issues: 0

An MCP (Model Context Protocol) server for LLMs and Retrieval-Augmented Generation (RAG) pipelines that provides automated web search and web page extraction. It runs locally and talks to the RAG Web Browser Actor in Standby mode: for each query it fetches and scrapes pages, then returns the cleaned content as Markdown. A single standardized 'search' tool handles both Google Search queries and direct URL fetching, with multiple output formats and configurable arguments. The server is deprecated in favor of mcp.apify.com, but it still illustrates how an MCP server can provide local web browsing integration.

Key Features

Implements MCP (Model Context Protocol) server for AI clients
Supports local operation with stdio communication
Integrates with RAG Web Browser Actor in Standby mode
Enables both Google Search and direct URL page scraping
Returns cleaned content as Markdown, HTML, or text
Configurable tool arguments (e.g., max results, output formats, timeouts)
Option to select scraping mode: browser-based or raw HTTP
Migration path to hosted mcp.apify.com service
OAuth and HTTP integration supported in alternatives
Real-time streaming via Server-Sent Events when calling the Actor directly (an alternative to this server)

Use Cases

Enabling LLM-powered agents to search the web and return summarized results
Retrieving content from specific pages for knowledge augmentation in chatbots
Building conversational interfaces with up-to-date web information access
Integrating web browsing capabilities into tools like Claude Desktop or VS Code
Automating research tasks for AI pipelines using standardized protocol
Rapidly prototyping RAG workflows with local or remote MCP servers
Extracting readable Markdown content from dynamic and static web pages
Providing context and background information to LLMs from real-world sources
Testing and developing custom AI clients requiring web search tools
Enhancing data retrieval for enterprise question-answering systems

README

MCP Server for the RAG Web Browser Actor 🌐

Implementation of an MCP server for the RAG Web Browser Actor. This Actor serves as a web browser for large language models (LLMs) and RAG pipelines, similar to a web search in ChatGPT.

This MCP server is deprecated in favor of mcp.apify.com

For the same functionality and much more, please use one of these alternatives:

🚀 Recommended: use mcp.apify.com

The easiest way to get the same web browsing capabilities is to use mcp.apify.com with default settings.

Benefits:

  • ✅ No local setup required
  • ✅ Always up-to-date
  • ✅ Access to 6,000+ Apify Actors (including RAG Web Browser)
  • ✅ OAuth support for easy connection
  • ✅ Dynamic tool discovery

Quick Setup:

  1. Go to https://mcp.apify.com
  2. Authorize the client (Claude, VS Code, etc.)
  3. Copy the generated MCP server configuration (or use OAuth flow if supported)
  4. Start using browsing & other tools immediately

🌐 Alternative: direct RAG Web Browser integration

You can also call the RAG Web Browser Actor directly via its HTTP/SSE interface.

Benefits:

  • ✅ Direct integration without mcp.apify.com
  • ✅ Real-time streaming via Server-Sent Events
  • ✅ Full control over the integration
  • ✅ No additional dependencies

Docs: Actor Documentation
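
For the streaming option, a client has to split the Server-Sent Events stream into individual events. The sketch below parses a chunk of SSE text using the standard `event:` / `data:` framing. The sample stream and its event names are made up for illustration and are not taken from the Actor's documentation.

```javascript
// Minimal SSE text parser: splits a buffer into events separated by blank lines.
// Handles only the `event` and `data` fields; `id`, `retry`, and comments are ignored.
function parseSseEvents(buffer) {
  const events = [];
  // Normalize line endings, then split on blank lines (the event boundary).
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  for (const block of blocks) {
    let eventName = "message"; // default event type in the SSE format
    const dataLines = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) {
        eventName = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    // Blocks without any data lines are skipped.
    if (dataLines.length > 0) {
      events.push({ event: eventName, data: dataLines.join("\n") });
    }
  }
  return events;
}

// Hypothetical sample stream, for illustration only.
const sampleStream = 'event: endpoint\ndata: /message?sessionId=abc\n\ndata: {"ok":true}\n\n';
const parsedEvents = parseSseEvents(sampleStream);
console.log(parsedEvents);
```

A production client would also need to buffer partial chunks, since an event can be split across network reads.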


🎯 What does this MCP server do?

This server is specifically designed to provide fast responses to AI agents and LLMs, allowing them to interact with the web and extract information from web pages. It runs locally and communicates with the RAG Web Browser Actor in Standby mode, sending search queries and receiving extracted web content in response.

  • Web Search: Query Google Search, scrape top N URLs, and return cleaned content as Markdown
  • Single URL Fetching: Fetch a specific URL and return its content as Markdown
  • Local MCP Integration: Standard input/output (stdio) communication with AI clients
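
As a rough sketch of the request flow described above, the snippet below builds a Standby search URL and wraps the call in a local timeout. The base URL, the query-parameter names, the `Authorization` header, and the JSON response are all assumptions for illustration; check the Actor documentation for the actual interface. The code relies on the global `fetch` and `AbortController` available in Node.js 18+.

```javascript
// ASSUMPTION: base URL of the Actor's Standby endpoint; verify against the Actor docs.
const STANDBY_BASE_URL = "https://rag-web-browser.apify.actor/search";

// Build the request URL from a query (keywords or a full URL) and optional settings.
// ASSUMPTION: the parameter names mirror the tool arguments `query` and `maxResults`.
function buildStandbyUrl(query, { maxResults = 1, baseUrl = STANDBY_BASE_URL } = {}) {
  const url = new URL(baseUrl);
  url.searchParams.set("query", query);
  url.searchParams.set("maxResults", String(maxResults));
  return url.toString();
}

// Send the request and abort it locally once timeoutSecs has elapsed.
// The caller supplies the token, e.g. from process.env.APIFY_TOKEN.
async function searchWeb(query, token, timeoutSecs = 40) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutSecs * 1000);
  try {
    const response = await fetch(buildStandbyUrl(query), {
      headers: { Authorization: `Bearer ${token}` }, // ASSUMPTION: bearer-token auth
      signal: controller.signal,
    });
    if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
    return await response.json(); // ASSUMPTION: the endpoint returns JSON
  } finally {
    clearTimeout(timer); // always clear the timer so the process can exit
  }
}

console.log(buildStandbyUrl("web browser for RAG pipelines", { maxResults: 3 }));
```

Only `buildStandbyUrl` runs here; `searchWeb` needs a valid token and network access.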

🧱 Components

Tools

  • name: search
    description: Query Google Search or fetch a direct URL and return cleaned page contents.
    arguments:
    • query (string, required): Search keywords or a full URL. Advanced Google operators supported.
    • maxResults (number, optional, default: 1): Max organic results to fetch (ignored when query is a URL).
    • scrapingTool (string, optional, default: raw-http): One of browser-playwright | raw-http.
      • raw-http: Fast (no JS execution) – good for static pages.
      • browser-playwright: Handles JS-heavy sites – slower, more robust.
    • outputFormats (array of strings, optional, default: [markdown]): One or more of text, markdown, html.
    • requestTimeoutSecs (number, optional, default: 40, min 1, max 300): Total time budget in seconds, covering both server-side processing and the client's wait. The local server aborts the request once the budget is exceeded.
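
The argument rules above can be sketched as a small validation helper. `normalizeSearchArgs` is a hypothetical name, not a function from this repository, and treating `maxResults` as a positive integer is an assumption; the defaults and limits come from the list above.

```javascript
// Allowed values, taken from the argument list above.
const SCRAPING_TOOLS = ["browser-playwright", "raw-http"];
const OUTPUT_FORMATS = ["text", "markdown", "html"];

// Hypothetical helper: fill in defaults and validate arguments for the `search` tool.
function normalizeSearchArgs(args) {
  if (typeof args.query !== "string" || args.query.trim() === "") {
    throw new Error("query is required and must be a non-empty string");
  }
  const normalized = {
    query: args.query.trim(),
    maxResults: args.maxResults ?? 1,
    scrapingTool: args.scrapingTool ?? "raw-http",
    outputFormats: args.outputFormats ?? ["markdown"],
    requestTimeoutSecs: args.requestTimeoutSecs ?? 40,
  };
  // ASSUMPTION: maxResults must be a positive integer.
  if (!Number.isInteger(normalized.maxResults) || normalized.maxResults < 1) {
    throw new Error("maxResults must be a positive integer");
  }
  if (!SCRAPING_TOOLS.includes(normalized.scrapingTool)) {
    throw new Error(`scrapingTool must be one of: ${SCRAPING_TOOLS.join(", ")}`);
  }
  const formats = normalized.outputFormats;
  if (!Array.isArray(formats) || formats.length === 0 ||
      formats.some((format) => !OUTPUT_FORMATS.includes(format))) {
    throw new Error(`outputFormats must contain only: ${OUTPUT_FORMATS.join(", ")}`);
  }
  const timeout = normalized.requestTimeoutSecs;
  if (typeof timeout !== "number" || timeout < 1 || timeout > 300) {
    throw new Error("requestTimeoutSecs must be between 1 and 300");
  }
  return normalized;
}

console.log(normalizeSearchArgs({ query: "site:docs.apify.com standby mode", maxResults: 3 }));
```

Rejecting bad input before the request is sent keeps error messages local and avoids spending the timeout budget on a call that cannot succeed.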

🔄 Migration Guide

From Local MCP Server to mcp.apify.com

Before (Deprecated local server):

json
{
  "mcpServers": {
    "rag-web-browser": {
      "command": "npx",
      "args": ["@apify/mcp-server-rag-web-browser"],
      "env": {
        "APIFY_TOKEN": "your-apify-api-token"
      }
    }
  }
}

After (Recommended Apify server):

json
{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["@apify/actors-mcp-server"],
      "env": {
        "APIFY_TOKEN": "your-apify-api-token"
      }
    }
  }
}

Or use the hosted endpoint: https://mcp.apify.com (when your client supports HTTP transport / remote MCP).
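
If several client configs need the same change, the before/after swap can be scripted. The helper below is a hypothetical sketch (`migrateConfig` is not part of either package): it replaces the `rag-web-browser` entry with the `apify` entry shown above and carries over the existing `env` block.

```javascript
// Hypothetical helper: rewrite a client config that uses the deprecated local server.
function migrateConfig(config) {
  const { "rag-web-browser": legacy, ...otherServers } = config.mcpServers ?? {};
  if (!legacy) return config; // nothing to migrate
  return {
    ...config,
    mcpServers: {
      ...otherServers, // keep any unrelated servers untouched
      apify: {
        command: "npx",
        args: ["@apify/actors-mcp-server"],
        env: { ...legacy.env }, // reuse the existing APIFY_TOKEN
      },
    },
  };
}

// Sample input matching the "Before" config above.
const legacyConfig = {
  mcpServers: {
    "rag-web-browser": {
      command: "npx",
      args: ["@apify/mcp-server-rag-web-browser"],
      env: { APIFY_TOKEN: "your-apify-api-token" },
    },
  },
};

const migratedConfig = migrateConfig(legacyConfig);
console.log(JSON.stringify(migratedConfig, null, 2));
```

The function returns a new object and leaves the input unchanged, so the original config can be kept as a backup.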

🛠️ Development

Prerequisites

  • Node.js (v18 or higher)
  • Apify API Token (APIFY_TOKEN)

Clone & install:

bash
git clone https://github.com/apify/mcp-server-rag-web-browser.git
cd mcp-server-rag-web-browser
npm install

Build

bash
npm install
npm run build

Debugging

Since MCP servers operate over standard input/output (stdio), debugging can be challenging. For the best debugging experience, use the MCP Inspector.

After building the project, you can launch the MCP Inspector via npx with this command:

bash
export APIFY_TOKEN=your-apify-api-token
npx @modelcontextprotocol/inspector node dist/index.js

Upon launching, the Inspector will display a URL that you can access in your browser to begin debugging.
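
When debugging by hand, it helps to know what a tool call looks like on the wire. MCP uses JSON-RPC 2.0, and the stdio transport carries one JSON message per line. The sketch below builds a `tools/call` request for the `search` tool; the helper names and the example URL are illustrative only.

```javascript
// Build a JSON-RPC 2.0 request that calls the `search` tool.
function buildToolCall(id, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "search", arguments: args },
  };
}

// The stdio transport sends newline-delimited JSON messages.
function toStdioFrame(message) {
  return JSON.stringify(message) + "\n";
}

const toolCallFrame = toStdioFrame(
  buildToolCall(1, { query: "https://example.com", outputFormats: ["markdown"] })
);
console.log(toolCallFrame);
```

A real session starts with the `initialize` handshake before any `tools/call` request, which the Inspector handles automatically.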

📖 Learn more

This repository is maintained for archival purposes only. Please use the recommended alternatives above for active development.

Repository Owner

apify (Organization)

Repository Details

Language JavaScript
Default Branch main
Size 179 KB
Contributors 2
License Apache License 2.0
MCP Verified Nov 12, 2025

Programming Languages

JavaScript 68.03%
TypeScript 31.97%

Related MCPs

Discover similar Model Context Protocol servers

  • Scrapeless MCP Server

    A real-time web integration layer for LLMs and AI agents built on the open MCP standard.

    Scrapeless MCP Server is a powerful integration layer enabling large language models, AI agents, and applications to interact with the web in real time. Built on the open Model Context Protocol, it facilitates seamless connections between models like ChatGPT, Claude, and tools such as Cursor to external web capabilities, including Google services, browser automation, and advanced data extraction. The system supports multiple transport modes and is designed to provide dynamic, real-world context to AI workflows. Robust scraping, dynamic content handling, and flexible export formats are core parts of the feature set.

    • 57
    • MCP
    • scrapeless-ai/scrapeless-mcp-server
  • mcp-local-rag

    Local RAG server for web search and context injection using Model Context Protocol.

    mcp-local-rag is a local server implementing the Model Context Protocol (MCP) to provide retrieval-augmented generation (RAG) capabilities. It performs live web search, extracts relevant context using Google's MediaPipe Text Embedder, and supplies the information to large language models (LLMs) for enhanced, up-to-date responses. The tool is designed for easy local deployment, requiring no external APIs, and is compatible with multiple MCP clients. Security audits are available, and integration is demonstrated across several LLM platforms.

    • 89
    • MCP
    • nkapila6/mcp-local-rag
  • MCP Web Research Server

    Bring real-time web research and Google search capabilities into Claude using MCP.

    MCP Web Research Server acts as a Model Context Protocol (MCP) server, seamlessly integrating web research functionalities with Claude Desktop. It enables Google search, webpage content extraction, research session tracking, and screenshot capture, all accessible directly from Claude. The server supports interactive and guided research sessions, exposing session data and screenshots as MCP resources for enhanced context-aware AI interactions.

    • 284
    • MCP
    • mzxrai/mcp-webresearch
  • Fetcher MCP

    Intelligent web content fetching and extraction using Playwright.

    Fetcher MCP is a server that fetches and extracts web page content using the Playwright headless browser while supporting the Model Context Protocol. It intelligently processes dynamic web pages with JavaScript, employs the Readability algorithm to extract main content, and supports output in both HTML and Markdown formats. Designed for seamless integration with AI model environments, it offers robust parallel processing, resource optimization, and flexible deployment options including Docker.

    • 906
    • MCP
    • jae-jae/fetcher-mcp
  • WebScraping.AI MCP Server

    MCP server for advanced web scraping and AI-driven data extraction

    WebScraping.AI MCP Server implements the Model Context Protocol to provide web data extraction and question answering functionalities. It integrates with WebScraping.AI to offer robust tools for retrieving, rendering, and parsing web content, including structured data and natural language answers from web pages. It supports JavaScript rendering, proxy management, device emulation, and custom extraction configurations, making it suitable for both individual and team deployments in AI-assisted workflows.

    • 33
    • MCP
    • webscraping-ai/webscraping-ai-mcp-server
  • G-Search MCP

    Parallel Google search server with MCP support for structured AI context.

    G-Search MCP is a server for executing parallel Google searches optimized for use with AI tools through the Model Context Protocol. It efficiently handles multiple queries at once, simulates realistic user behavior to evade detection, and returns well-structured JSON responses. The server is configurable, supports parameter tuning like search limits and timeouts, and is designed for seamless integration in AI-driven environments such as Claude Desktop.

    • 224
    • MCP
    • jae-jae/g-search-mcp