Kaggle MCP Server

Model Context Protocol server enabling Kaggle dataset search and download tools.

28 Stars · 7 Forks · 28 Watchers · 1 Issue
Kaggle MCP Server implements the Model Context Protocol (MCP) using the fastmcp library and provides tools for searching and downloading datasets from Kaggle through a standardized MCP interface. It manages Kaggle API authentication, exposes search and download functionality as MCP tools, and offers a prompt for generating exploratory data analysis notebooks. The server can be run locally or via Docker, making it easy to integrate with MCP clients and other MCP-compliant applications.

Key Features

Implements Model Context Protocol (MCP) server
Search for Kaggle datasets with custom queries
Download and unzip Kaggle datasets programmatically
Integration with Kaggle API through environment variables or kaggle.json
Provides MCP-compliant resources, tools, and prompts
EDA (Exploratory Data Analysis) notebook generation prompt
Docker support for easy deployment
Python-based, built on fastmcp
Supports virtual environments and modern dependency management
Automatic resource registration on server start

Use Cases

Automating dataset discovery for machine learning projects
Downloading and managing datasets for data science workflows
Integrating Kaggle data access into larger MCP workflows
Generating EDA notebooks programmatically via an MCP interface
Batch downloading and preparing datasets for model training
Serving as a backend for tools that require standardized dataset access
Streamlining Kaggle data acquisition in cloud or containerized environments
Building custom data pipelines with MCP client compatibility
Supporting educational projects requiring reproducible dataset access
Unifying dataset search, download, and analysis in a single protocol-driven server

README

Kaggle MCP (Model Context Protocol) Server

This repository contains an MCP (Model Context Protocol) server (server.py) built using the fastmcp library. It interacts with the Kaggle API to provide tools for searching and downloading datasets, and a prompt for generating EDA notebooks.

Project Structure

  • server.py: The FastMCP server application. It defines resources, tools, and prompts for interacting with Kaggle.
  • .env.example: An example file for environment variables (Kaggle API credentials). Rename to .env and fill in your details.
  • requirements.txt: Lists the necessary Python packages.
  • pyproject.toml & uv.lock: Project metadata and locked dependencies for the uv package manager.
  • datasets/: Default directory where downloaded Kaggle datasets will be stored.

Setup

  1. Clone the repository:

    bash
    git clone <repository-url>
    cd <repository-directory>
    
  2. Create a virtual environment (recommended):

    bash
    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
    # Or use uv: uv venv
    
  3. Install dependencies: Using pip:

    bash
    pip install -r requirements.txt
    

    Or using uv:

    bash
    uv sync
    
  4. Set up Kaggle API credentials:

    • Method 1 (Recommended): Environment Variables
      • Copy .env.example to .env in the project root
      • Open the .env file and add your Kaggle username and API key:
        dotenv
        KAGGLE_USERNAME=your_kaggle_username
        KAGGLE_KEY=your_kaggle_api_key
        
      • You can obtain your API key from your Kaggle account page (Account > API > Create New API Token). This will download a kaggle.json file containing your username and key.
    • Method 2: kaggle.json file
      • Download your kaggle.json file from your Kaggle account.
      • Place the kaggle.json file in the expected location (usually ~/.kaggle/kaggle.json on Linux/macOS or C:\Users\<Your User Name>\.kaggle\kaggle.json on Windows). The kaggle library will automatically detect this file if the environment variables are not set.
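
Either way, the kaggle client resolves credentials at authentication time, preferring the environment variables and falling back to kaggle.json. As an illustrative sanity check (not part of this repository), you could verify your credentials from the same .env file like this, assuming the python-dotenv and kaggle packages are installed:

python
from dotenv import load_dotenv

# Load KAGGLE_USERNAME / KAGGLE_KEY from .env before importing kaggle,
# because the kaggle package authenticates as soon as it is imported.
load_dotenv()

import kaggle  # falls back to ~/.kaggle/kaggle.json if the env vars are absent

# List a few datasets to confirm the credentials work.
for dataset in kaggle.api.dataset_list(search="titanic")[:3]:
    print(dataset)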

Running the Server

  1. Ensure your virtual environment is active.
  2. Run the MCP server:
    bash
    uv run kaggle-mcp
    
    The server will start and register its resources, tools, and prompts. You can interact with it using an MCP client or compatible tools.

Running the Docker Container

1. Set up Kaggle API credentials

This project requires Kaggle API credentials to access Kaggle datasets.

  • Go to https://www.kaggle.com/settings and click "Create New API Token" to download your kaggle.json file.
  • Open the kaggle.json file and copy your username and key into a new .env file in the project root:

dotenv
KAGGLE_USERNAME=your_username
KAGGLE_KEY=your_key

2. Build the Docker image

sh
docker build -t kaggle-mcp-test .

3. Run the Docker container using your .env file

sh
docker run --rm -it --env-file .env kaggle-mcp-test

This will automatically load your Kaggle credentials as environment variables inside the container.


Server Features

The server exposes the following capabilities through the Model Context Protocol:

Tools

  • search_kaggle_datasets(query: str):
    • Searches for datasets on Kaggle matching the provided query string.
    • Returns a JSON list of the top 10 matching datasets with details like reference, title, download count, and last updated date.
  • download_kaggle_dataset(dataset_ref: str, download_path: str | None = None):
    • Downloads and unzips files for a specific Kaggle dataset.
    • dataset_ref: The dataset identifier in the format username/dataset-slug (e.g., kaggle/titanic).
    • download_path (Optional): Specifies where to download the dataset. If omitted, it defaults to ./datasets/<dataset_slug>/ relative to the server script's location.
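
The actual implementations live in server.py; the following is only a rough sketch of how tools with these signatures could be registered using fastmcp and the official Kaggle API client. The names mirror the tool descriptions above, but the bodies and defaults are illustrative assumptions, not the repository's code.

python
import json
from pathlib import Path

from fastmcp import FastMCP
from kaggle.api.kaggle_api_extended import KaggleApi

mcp = FastMCP("kaggle-mcp")
api = KaggleApi()
api.authenticate()  # env vars or ~/.kaggle/kaggle.json (see Setup above)


@mcp.tool()
def search_kaggle_datasets(query: str) -> str:
    """Return the top matching Kaggle datasets as a JSON string."""
    results = api.dataset_list(search=query)[:10]
    # The real tool also reports download count and last-updated date.
    return json.dumps([{"ref": str(d.ref), "title": d.title} for d in results])


@mcp.tool()
def download_kaggle_dataset(dataset_ref: str, download_path: str | None = None) -> str:
    """Download and unzip a dataset given a 'username/dataset-slug' reference."""
    target = Path(download_path or f"./datasets/{dataset_ref.split('/')[-1]}")
    target.mkdir(parents=True, exist_ok=True)
    api.dataset_download_files(dataset_ref, path=str(target), unzip=True)
    return f"Downloaded and unzipped {dataset_ref} to {target}"


if __name__ == "__main__":
    mcp.run()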

Prompts

  • generate_eda_notebook(dataset_ref: str):
    • Generates a prompt message suitable for an AI model (like Gemini) to create a basic Exploratory Data Analysis (EDA) notebook for the specified Kaggle dataset reference.
    • The prompt asks for Python code covering data loading, missing value checks, visualizations, and basic statistics.
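
As above, this is an illustrative sketch of how such a prompt might be declared with fastmcp's prompt decorator; the exact wording produced by server.py may differ.

python
@mcp.prompt()
def generate_eda_notebook(dataset_ref: str) -> str:
    """Build a prompt asking a model to draft a basic EDA notebook."""
    return (
        f"Write Python code for a basic exploratory data analysis notebook on "
        f"the Kaggle dataset '{dataset_ref}'. Cover data loading, missing value "
        f"checks, visualizations, and summary statistics."
    )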

Connecting to Claude Desktop

Go to Claude > Settings > Developer > Edit Config > claude_desktop_config.json to include the following:

json
{
  "mcpServers": {
    "kaggle-mcp": {
      "command": "kaggle-mcp",
      "cwd": "<path-to-their-cloned-repo>/kaggle-mcp"
    }
  }
}

Usage Example

An AI agent or MCP client could interact with this server like this:

  1. Agent: "Search Kaggle for datasets about 'heart disease'"
    • Server executes search_kaggle_datasets(query='heart disease')
  2. Agent: "Download the dataset 'user/heart-disease-dataset'"
    • Server executes download_kaggle_dataset(dataset_ref='user/heart-disease-dataset')
  3. Agent: "Generate an EDA notebook prompt for 'user/heart-disease-dataset'"
    • Server executes generate_eda_notebook(dataset_ref='user/heart-disease-dataset')
    • Server returns a structured prompt message.
  4. Agent: (Sends the prompt to a code-generating model) -> Receives EDA Python code.
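
Programmatically, the same exchange could be driven with an MCP client over stdio. The sketch below uses the official mcp Python SDK (a dependency of fastmcp); the tool and prompt names match those above, but the client-side code is an illustration, not part of this repository.

python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Launch the server over stdio, mirroring the Claude Desktop config above.
    params = StdioServerParameters(command="kaggle-mcp", args=[])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # 1. Search for datasets about heart disease.
            search = await session.call_tool(
                "search_kaggle_datasets", {"query": "heart disease"}
            )

            # 2. Download one of the returned datasets.
            await session.call_tool(
                "download_kaggle_dataset",
                {"dataset_ref": "user/heart-disease-dataset"},
            )

            # 3. Fetch the EDA notebook prompt to pass to a code-generating model.
            prompt = await session.get_prompt(
                "generate_eda_notebook", {"dataset_ref": "user/heart-disease-dataset"}
            )
            print(search, prompt)


asyncio.run(main())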

Repository Owner

arrismo (User)

Repository Details

Language: Python
Default Branch: main
Size: 78 KB
Contributors: 1
License: MIT License
MCP Verified: Nov 12, 2025

Programming Languages

Python: 87.99%
Dockerfile: 12.01%

Topics

kaggle, mcp-server


Related MCPs

Discover similar Model Context Protocol servers

  • MCP Server for Data Exploration

    Interactive Data Exploration and Analysis via Model Context Protocol

    MCP Server for Data Exploration enables users to interactively explore and analyze complex datasets using prompt templates and tools within the Model Context Protocol ecosystem. Designed as a personal Data Scientist assistant, it facilitates the conversion of raw data into actionable insights without manual intervention. Users can load CSV datasets, run Python scripts, and generate tailored reports and visualizations through an AI-powered interface. The server integrates directly with Claude Desktop, supporting rapid setup and seamless usage for both macOS and Windows.

    • 503
    • MCP
    • reading-plus-ai/mcp-server-data-exploration
  • Graphlit MCP Server

    Integrate and unify knowledge sources for RAG-ready AI context with the Graphlit MCP Server.

    Graphlit MCP Server provides a Model Context Protocol interface, enabling seamless integration between MCP clients and the Graphlit platform. It supports ingestion from a wide array of sources such as Slack, Discord, Google Drive, email, Jira, and GitHub, turning them into a searchable, RAG-ready knowledge base. Built-in tools allow for document, media extraction, web crawling, and web search, as well as advanced retrieval and publishing functionalities. The server facilitates easy configuration, sophisticated data operations, and automated notifications for diverse workflows.

    • 369
    • MCP
    • graphlit/graphlit-mcp-server
  • Unsplash MCP Server

    Seamless Unsplash image integration via the Model Context Protocol.

    Unsplash MCP Server provides a simple and robust interface to search and integrate high-quality Unsplash images through the Model Context Protocol (MCP). It offers advanced photo search capabilities with filters for keywords, color schemes, orientation, and sorting. Designed for easy integration with development environments such as Cursor and Smithery, it simplifies embedding Unsplash image search into AI and automation workflows.

    • 186
    • MCP
    • hellokaton/unsplash-mcp-server
  • DuckDuckGo Search MCP Server

    A Model Context Protocol server for DuckDuckGo web search and intelligent content retrieval.

    DuckDuckGo Search MCP Server provides web search capabilities through DuckDuckGo, with advanced content fetching and parsing tailored for large language models. It supports rate limiting, error handling, and delivers results in an LLM-friendly format. The server is designed for seamless integration with AI applications and tools like Claude Desktop, enabling enhanced web search and content extraction through the Model Context Protocol.

    • 637
    • MCP
    • nickclyde/duckduckgo-mcp-server
  • tavily-search MCP server

    A search server that integrates Tavily API with Model Context Protocol tools.

    tavily-search MCP server provides an MCP-compliant server to perform search queries using the Tavily API. It returns search results in text format, including AI responses, URLs, and result titles. The server is designed for easy integration with clients like Claude Desktop or Cursor and supports both local and Docker-based deployment. It facilitates AI workflows by offering search functionality as part of a standardized protocol interface.

    • 44
    • MCP
    • Tomatio13/mcp-server-tavily
  • mcp-local-rag

    Local RAG server for web search and context injection using Model Context Protocol.

    mcp-local-rag is a local server implementing the Model Context Protocol (MCP) to provide retrieval-augmented generation (RAG) capabilities. It performs live web search, extracts relevant context using Google's MediaPipe Text Embedder, and supplies the information to large language models (LLMs) for enhanced, up-to-date responses. The tool is designed for easy local deployment, requiring no external APIs, and is compatible with multiple MCP clients. Security audits are available, and integration is demonstrated across several LLM platforms.

    • 89
    • MCP
    • nkapila6/mcp-local-rag