RAG Documentation MCP Server

Vector-based documentation search and context augmentation for AI assistants

238 Stars · 29 Forks · 238 Watchers · 3 Issues
RAG Documentation MCP Server provides vector-based search and retrieval tools for documentation, enabling large language models to reference relevant context in their responses. It supports managing multiple documentation sources, semantic search, and real-time context delivery. Documentation can be indexed, searched, and managed through a processing queue, making the server well suited to AI-driven assistants. Integration with Claude Desktop and support for the Qdrant vector database are also available.

Key Features

Vector-based semantic search
Documentation excerpt retrieval with context
Support for multiple documentation sources
Automated documentation processing and indexing
Documentation source management (listing, removing)
Real-time augmentation of LLM responses
Queue management for documentation processing
Integration with Qdrant vector database
Extraction and queuing of URLs for indexing
Permanent removal and clearing of documentation and queue

Use Cases

Enhancing virtual assistants with up-to-date documentation context
Building AI bots aware of technical or user documentation
Providing semantic search across multiple documentation sources
Supporting developer tools that surface explanatory content
Real-time augmentation of language model outputs with authoritative references
Automating indexing and updating of documentation for AI access
Monitoring and managing the ingestion pipeline for documentation
Rapid prototyping of documentation-aware chatbots
Integrating external web content into searchable knowledge bases
Customizing LLM tools for specialized documentation domains

README

RAG Documentation MCP Server

An MCP server implementation that provides tools for retrieving and processing documentation through vector search, enabling AI assistants to augment their responses with relevant documentation context.

Features

  • Vector-based documentation search and retrieval
  • Support for multiple documentation sources
  • Semantic search capabilities
  • Automated documentation processing
  • Real-time context augmentation for LLMs

Tools

search_documentation

Search through stored documentation using natural language queries. Returns matching excerpts with context, ranked by relevance.

Inputs:

  • query (string): The text to search for in the documentation. Can be a natural language query, specific terms, or code snippets.
  • limit (number, optional): Maximum number of results to return (1-20, default: 5). Higher limits provide more comprehensive results but may take longer to process.
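
For illustration, a search call's arguments might look like the following. This is a sketch only: the surrounding request envelope depends on your MCP client, and the query text is a made-up example.

```json
{
  "name": "search_documentation",
  "arguments": {
    "query": "how to configure Qdrant authentication",
    "limit": 5
  }
}
```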

list_sources

List all documentation sources currently stored in the system. Returns a comprehensive list of all indexed documentation including source URLs, titles, and last update times. Use this to understand what documentation is available for searching or to verify if specific sources have been indexed.
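
Since each entry includes a source URL, title, and last update time, a response might look roughly like this (field names are illustrative, not the server's actual schema):

```json
[
  {
    "url": "https://example.com/docs",
    "title": "Example Product Docs",
    "lastUpdated": "2025-01-15T10:30:00Z"
  }
]
```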

extract_urls

Extract and analyze all URLs from a given web page. This tool crawls the specified webpage, identifies all hyperlinks, and optionally adds them to the processing queue.

Inputs:

  • url (string): The complete URL of the webpage to analyze (must include protocol, e.g., https://). The page must be publicly accessible.
  • add_to_queue (boolean, optional): If true, automatically add extracted URLs to the processing queue for later indexing. Use with caution on large sites to avoid excessive queuing.
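
As a sketch, crawling a documentation landing page and queueing its links might look like this (the URL is a placeholder):

```json
{
  "name": "extract_urls",
  "arguments": {
    "url": "https://example.com/docs",
    "add_to_queue": true
  }
}
```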

remove_documentation

Remove specific documentation sources from the system by their URLs. The removal is permanent and will affect future search results.

Inputs:

  • urls (string[]): Array of URLs to remove from the database. Each URL must exactly match the URL used when the documentation was added.
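
For example, a removal request could look like the following; the URLs are placeholders and must exactly match those used when the documentation was added:

```json
{
  "name": "remove_documentation",
  "arguments": {
    "urls": [
      "https://example.com/docs/getting-started",
      "https://example.com/docs/api"
    ]
  }
}
```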

list_queue

List all URLs currently waiting in the documentation processing queue. Shows pending documentation sources that will be processed when run_queue is called. Use this to monitor queue status, verify URLs were added correctly, or check processing backlog.

run_queue

Process and index all URLs currently in the documentation queue. Each URL is processed sequentially, with proper error handling and retry logic. Progress updates are provided as processing occurs. Long-running operations will process until the queue is empty or an unrecoverable error occurs.

clear_queue

Remove all pending URLs from the documentation processing queue. Use this to reset the queue when you want to start fresh, remove unwanted URLs, or cancel pending processing. This operation is immediate and permanent; URLs must be re-added if you want to process them later.
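
Taken together, a typical ingestion pass chains the queue tools: extract links from an entry page, verify the backlog, then process it. Sketched below as a sequence of calls, with a placeholder URL and the client-specific envelope omitted:

```json
[
  { "name": "extract_urls", "arguments": { "url": "https://example.com/docs", "add_to_queue": true } },
  { "name": "list_queue", "arguments": {} },
  { "name": "run_queue", "arguments": {} }
]
```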

Usage

The RAG Documentation tool is designed for:

  • Enhancing AI responses with relevant documentation
  • Building documentation-aware AI assistants
  • Creating context-aware tooling for developers
  • Implementing semantic documentation search
  • Augmenting existing knowledge bases

Configuration

Usage with Claude Desktop

Add this to your claude_desktop_config.json:

```json
{
  "mcpServers": {
    "rag-docs": {
      "command": "npx",
      "args": [
        "-y",
        "@hannesrudolph/mcp-ragdocs"
      ],
      "env": {
        "OPENAI_API_KEY": "",
        "QDRANT_URL": "",
        "QDRANT_API_KEY": ""
      }
    }
  }
}
```

You'll need to provide values for the following environment variables:

  • OPENAI_API_KEY: Your OpenAI API key for embeddings generation
  • QDRANT_URL: URL of your Qdrant vector database instance
  • QDRANT_API_KEY: API key for authenticating with Qdrant
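
As an illustration, the env block might be filled in like this for a locally hosted Qdrant instance. All values are placeholders: 6333 is Qdrant's default REST port, and a local unauthenticated instance may not require an API key at all.

```json
{
  "OPENAI_API_KEY": "sk-YOUR-OPENAI-KEY",
  "QDRANT_URL": "http://localhost:6333",
  "QDRANT_API_KEY": "YOUR-QDRANT-KEY-IF-REQUIRED"
}
```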

License

This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.

Acknowledgments

This project is a fork of qpd-v/mcp-ragdocs, originally developed by qpd-v. The original project provided the foundation for this implementation.

Repository Details

Language: TypeScript
Default Branch: main
Size: 2,886 KB
Contributors: 4
License: MIT License
MCP Verified: Nov 12, 2025

Programming Languages

TypeScript: 97.67%
JavaScript: 2.33%

Topics

llm, mcp, mcp-servers, rag, vector-database

Related MCPs

Discover similar Model Context Protocol servers

  • MCP Local RAG (shinpr/mcp-local-rag, 10 stars)

    Privacy-first local semantic document search server for MCP clients.

    MCP Local RAG is a privacy-preserving, local document search server designed for use with Model Context Protocol (MCP) clients such as Cursor, Codex, and Claude Code. It enables users to ingest and semantically search local documents without using external APIs or cloud services. All processing, including embedding generation and vector storage, is performed on the user's machine. The tool supports document ingestion, semantic search, file management, file deletion, and system status reporting through MCP.

  • Biel.ai MCP Server (TechDocsStudio/biel-mcp, 2 stars)

    Seamlessly connect IDEs to your company's product documentation using an MCP server.

    Biel.ai MCP Server enables AI tools such as Cursor, VS Code, and Claude Desktop to access and utilize a company's product documentation and knowledge base through the Model Context Protocol. It provides a hosted RAG layer that makes documentation searchable and usable, supporting real-time, context-rich completions and answers for developers. The server can be used as a hosted solution or self-hosted locally or via Docker for advanced customization.

  • Trieve (devflowinc/trieve, 2,555 stars)

    All-in-one solution for search, recommendations, and RAG.

    Trieve offers a platform for semantic search, recommendations, and retrieval-augmented generation (RAG). It supports dense vector search, typo-tolerant neural search, sub-sentence highlighting, and integrates with a variety of embedding models. Trieve can be self-hosted and features APIs for context management with LLMs, including Bring Your Own Model and managed RAG endpoints. Full documentation and SDKs are available for streamlined integration.

  • Driflyte MCP Server (serkan-ozal/driflyte-mcp-server, 9 stars)

    Bridging AI assistants with deep, topic-aware knowledge from web and code sources.

    Driflyte MCP Server acts as a bridge between AI-powered assistants and diverse, topic-aware content sources by exposing a Model Context Protocol (MCP) server. It enables retrieval-augmented generation workflows by crawling, indexing, and serving topic-specific documents from web pages and GitHub repositories. The system is extensible, with planned support for additional knowledge sources, and is designed for easy integration with popular AI tools such as ChatGPT, Claude, and VS Code.

  • mcp-local-rag (nkapila6/mcp-local-rag, 89 stars)

    Local RAG server for web search and context injection using the Model Context Protocol.

    mcp-local-rag is a local server implementing the Model Context Protocol (MCP) to provide retrieval-augmented generation (RAG) capabilities. It performs live web search, extracts relevant context using Google's MediaPipe Text Embedder, and supplies the information to large language models (LLMs) for enhanced, up-to-date responses. The tool is designed for easy local deployment, requiring no external APIs, and is compatible with multiple MCP clients. Security audits are available, and integration is demonstrated across several LLM platforms.

  • Context7 MCP (upstash/context7, 36,881 stars)

    Up-to-date code docs for every AI prompt.

    Context7 MCP delivers current, version-specific documentation and code examples directly into large language model prompts. By integrating with model workflows, it ensures responses are accurate and based on the latest source material, reducing outdated and hallucinated code. Users can fetch relevant API documentation and examples by simply adding a directive to their prompts. This allows for more reliable, context-rich answers tailored to real-world programming scenarios.