mcp-pandoc

Seamless document format conversion via the Model Context Protocol.

mcp-pandoc is a Model Context Protocol (MCP) server for document format conversion powered by Pandoc. It enables bidirectional transformation of content between various document formats while preserving structure and formatting. Designed for integration in AI workflows, it allows standardized and programmatic conversion suitable for large language models and tool-augmented assistants. The project exposes conversion functionality both by direct content submission and via input files.

Key Features

Supports multiple document formats including Markdown, HTML, DOCX, and more
Bidirectional conversion matrix for flexible format interchange
MCP-compatible server for integration into model workflows
Direct content and file path input options
Preserves formatting and document structure during conversion
Early PDF support (under development)
Reference cheatsheet for quick usage and workflows
Automatable and standardized API for conversion tasks
Enables integration with AI assistants and toolchains
Comprehensive demo and usage examples

Use Cases

Automated document format conversion in AI pipelines
Integrating with AI assistants for file upload and processing
Programmatically transforming report formats for enterprise workflows
Standardizing input/output formats for LLM-powered tools
Preparing documents for publishing or presentation in different formats
Facilitating context conversion for LLM prompt engineering
Supporting chat-based document editing and conversion
Developing plugins for collaborative document platforms
Batch processing of diverse file types in a unified way
Prototyping workflow automations involving content transformations

README

mcp-pandoc: A Document Conversion MCP Server

Officially included in the Model Context Protocol servers open-source project. 🎉

Overview

A Model Context Protocol server for document format conversion using pandoc. This server provides tools to transform content between different document formats while preserving formatting and structure.

Please note that mcp-pandoc is currently in early development. PDF support is under development, and the functionality and available tools are subject to change and expansion as we continue to improve the server.

Credit: This project builds on the Pandoc Python package for document conversion.

📋 Quick Reference

New to mcp-pandoc? Check out 📖 CHEATSHEET.md for:

  • ⚡ Copy-paste examples for all formats
  • 🔄 Bidirectional conversion matrix
  • 🎯 Common workflows and pro tips
  • 🌟 Reference document styling guide

Perfect for quick lookups and getting started fast!

Demo

mcp-pandoc - v1: Seamless Document Format Conversion for Claude using MCP server

🎥 Watch on YouTube

More to come...

Tools

  1. convert-contents
    • Transforms content between supported formats
    • Inputs:
      • contents (string): Source content to convert (required if input_file not provided)
      • input_file (string): Complete path to input file (required if contents not provided)
      • input_format (string): Source format of the content (defaults to markdown)
      • output_format (string): Target format (defaults to markdown)
      • output_file (string): Complete path for output file (required for pdf, docx, rst, latex, epub formats)
      • reference_doc (string): Path to a reference document to use for styling (supported for docx output format)
      • defaults_file (string): Path to a Pandoc defaults file (YAML) containing conversion options
      • filters (array): List of Pandoc filter paths to apply during conversion
    • Supported input/output formats:
      • markdown
      • html
      • pdf
      • docx
      • rst
      • latex
      • epub
      • txt
      • ipynb
      • odt
    • Note: For advanced formats (pdf, docx, rst, latex, epub), an output_file path is required
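
The input rules above can be sketched as a small client-side check that wraps the arguments in an MCP `tools/call` request. This is a minimal sketch, not the server's own validation: `build_convert_request` and `ADVANCED_FORMATS` are hypothetical names introduced for illustration, and only the parameter names come from the tool description.

```python
import json

# Formats the README lists as requiring an output_file path.
ADVANCED_FORMATS = {"pdf", "docx", "rst", "latex", "epub"}


def build_convert_request(arguments: dict, request_id: int = 1) -> dict:
    """Check the documented input rules, then wrap the arguments in a
    JSON-RPC `tools/call` request (hypothetical helper, not part of mcp-pandoc)."""
    if not arguments.get("contents") and not arguments.get("input_file"):
        raise ValueError("provide either 'contents' or 'input_file'")
    # output_format defaults to markdown according to the tool description.
    output_format = arguments.get("output_format", "markdown")
    if output_format in ADVANCED_FORMATS and not arguments.get("output_file"):
        raise ValueError(f"'output_file' is required for {output_format} output")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "convert-contents", "arguments": arguments},
    }


request = build_convert_request(
    {
        "contents": "# Hello\n\nSome *markdown* text.",
        "input_format": "markdown",
        "output_format": "html",
    }
)
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop builds this request for you; the sketch only makes the required/optional rules explicit.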

🔧 Advanced Features

Defaults Files (YAML Configuration)

Use defaults files to create reusable conversion templates with consistent formatting:

yaml
# academic-paper.yaml
from: markdown
to: pdf
number-sections: true
toc: true
metadata:
  title: "Academic Paper"
  author: "Research Team"

Example usage: "Convert paper.md to PDF using defaults academic-paper.yaml and save as paper.pdf"
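
For reference, the same defaults file can be passed to the pandoc CLI directly through its `--defaults` option. The sketch below only builds the command line; whether the server invokes pandoc in exactly this way is an assumption, and `build_defaults_command` is a hypothetical helper.

```python
import shlex


def build_defaults_command(input_file: str, defaults_file: str, output_file: str) -> list[str]:
    """Build a pandoc command that applies a defaults file (illustrative only)."""
    return [
        "pandoc",
        input_file,
        f"--defaults={defaults_file}",  # YAML file with reusable options
        f"--output={output_file}",
    ]


command = build_defaults_command("paper.md", "academic-paper.yaml", "paper.pdf")
print(shlex.join(command))

# To actually run it (requires pandoc, the input files, and TeX Live for PDF):
#   import subprocess
#   subprocess.run(command, check=True)
```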

Pandoc Filters

Apply custom filters for enhanced processing:

Example usage: "Convert docs.md to HTML with filters ['/path/to/mermaid-filter.py'] and save as docs.html"
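
To illustrate what a filter path can point to, here is a minimal sketch of a pandoc JSON filter written with the standard library only. It upper-cases every `Str` element in the document AST. The filter itself is a made-up example (it is not the mermaid filter mentioned above); it relies on pandoc's documented behavior of piping the JSON AST through the filter and passing the target format as the first argument.

```python
import json
import sys


def uppercase_strings(node):
    """Recursively upper-case every Str element in a pandoc JSON AST."""
    if isinstance(node, dict):
        if node.get("t") == "Str":
            return {"t": "Str", "c": node["c"].upper()}
        return {key: uppercase_strings(value) for key, value in node.items()}
    if isinstance(node, list):
        return [uppercase_strings(item) for item in node]
    return node


# Small hand-written AST fragment used to try the function without pandoc.
sample = {
    "t": "Para",
    "c": [{"t": "Str", "c": "hello"}, {"t": "Space"}, {"t": "Str", "c": "world"}],
}
print(json.dumps(uppercase_strings(sample)))

if __name__ == "__main__" and len(sys.argv) > 1:
    # When pandoc runs the filter it passes the target format as argv[1]
    # and streams the document AST over stdin/stdout.
    json.dump(uppercase_strings(json.load(sys.stdin)), sys.stdout)
```

Saved as a file (for example `uppercase.py`, a hypothetical name), its path could then be listed in the `filters` array.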

💡 For comprehensive examples and workflows, see CHEATSHEET.md

📊 Supported Formats & Conversions

Bidirectional Conversion Matrix

Source formats (rows): Markdown, HTML, TXT, DOCX, RST, LaTeX, EPUB, IPYNB, ODT
Target formats (columns): MD, HTML, TXT, DOCX, PDF, RST, LaTeX, EPUB, IPYNB, ODT

A Note on PDF Support

This tool uses pandoc for conversions, which allows for generating PDF files from the formats listed above. However, converting from a PDF to other formats is not supported. Therefore, PDF should be considered an output-only format.
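
The output-only rule can be made concrete by enumerating the conversion pairs it leaves available. This is a sketch under stated assumptions: the format names come from the supported-formats list, same-format pairs are skipped as an arbitrary choice, and the result is not a guarantee that every pair is tested or works equally well.

```python
from itertools import product

FORMATS = ["markdown", "html", "txt", "docx", "pdf", "rst", "latex", "epub", "ipynb", "odt"]
OUTPUT_ONLY = {"pdf"}  # PDF can be written but not read

# Every format except PDF may act as a source.
SOURCES = [fmt for fmt in FORMATS if fmt not in OUTPUT_ONLY]

# All (source, target) pairs, skipping conversions to the same format.
PAIRS = [(src, dst) for src, dst in product(SOURCES, FORMATS) if src != dst]

print(len(PAIRS))  # 9 sources x 10 targets, minus 9 same-format pairs = 81
print(("markdown", "pdf") in PAIRS, ("pdf", "markdown") in PAIRS)
```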

Format Categories

| Category | Formats | Requirements |
|----------|---------|--------------|
| Basic | MD, HTML, TXT, IPYNB, ODT | None |
| Advanced | DOCX, PDF, RST, LaTeX, EPUB | Must specify output_file path |
| Styled | DOCX with reference doc | Custom template support ⭐ |

Requirements by Format

  • PDF (.pdf) - requires TeX Live installation
  • DOCX (.docx) - supports custom styling via reference documents
  • All others - no additional requirements

Note: For advanced formats:

  1. Complete file paths with filename and extension are required
  2. PDF conversion requires a TeX Live installation (see the "Critical Requirements" section; on macOS: brew install texlive)
  3. When no output path is specified:
    • Basic formats: Displays converted content in the chat
    • Advanced formats: May save in system temp directory (/tmp/ on Unix systems)
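
The "complete file path" rule in item 1 can be checked before sending a request. The helper below is a hypothetical sketch that assumes POSIX-style paths; it only tests for a filename with an extension and does not verify that the directory exists.

```python
from pathlib import PurePosixPath


def is_complete_file_path(path: str) -> bool:
    """Return True when the path names a file that has an extension."""
    if not path or path.endswith("/"):
        return False  # empty, or a directory such as /documents/
    return PurePosixPath(path).suffix != ""


for candidate in ["/path/to/document.pdf", "/documents/", "/documents/story"]:
    print(candidate, is_complete_file_path(candidate))
```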

Usage & configuration

NOTE: Install the required packages listed under "Critical Requirements" below before proceeding.

To use the published package:

json
{
  "mcpServers": {
    "mcp-pandoc": {
      "command": "uvx",
      "args": ["mcp-pandoc"]
    }
  }
}

💡 Quick Start: See CHEATSHEET.md for copy-paste examples and common workflows.

⚠️ Important Notes

Critical Requirements

  1. Pandoc Installation
  • Required: Install pandoc - the core document conversion engine

  • Installation:

    bash
    # macOS
    brew install pandoc
    
    # Ubuntu/Debian
    sudo apt-get install pandoc
    
    # Windows
    # Download installer from: https://pandoc.org/installing.html
    
  • Verify: pandoc --version

  2. UV package installation
  • Required: Install uv package (includes uvx command)

  • Installation:

    bash
    # macOS
    brew install uv
    
    # Windows/Linux
    pip install uv
    
  • Verify: uvx --version

  3. PDF Conversion Prerequisites (only needed if you convert to and save PDF files)
  • TeX Live must be installed before attempting PDF conversion

  • Installation commands:

    bash
    # Ubuntu/Debian
    sudo apt-get install texlive-xetex
    
    # macOS
    brew install texlive
    
    # Windows
    # Install MiKTeX or TeX Live from:
    # https://miktex.org/ or https://tug.org/texlive/
    
  4. File Path Requirements
  • When saving or converting files, you MUST provide complete file paths including filename and extension
  • The tool does not automatically generate filenames or extensions
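
The tool prerequisites in items 1 to 3 can be checked from Python before configuring the server. This is a minimal sketch: `missing_tools` is a hypothetical helper, and looking for the `xelatex` executable is an assumption based on the "xelatex not found" error this README mentions for PDF conversion.

```python
import shutil


def missing_tools(need_pdf: bool = False) -> list[str]:
    """Return the required executables that are not on PATH."""
    tools = ["pandoc", "uvx"]
    if need_pdf:
        tools.append("xelatex")  # provided by TeX Live / MiKTeX
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(need_pdf=True)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All prerequisites found")
```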

Examples

✅ Correct Usage:

bash
# Converting content to PDF
"Convert this text to PDF and save as /path/to/document.pdf"

# Converting between file formats
"Convert /path/to/input.md to PDF and save as /path/to/output.pdf"

# Converting to DOCX with a reference document template
"Convert input.md to DOCX using template.docx as reference and save as output.docx"

# Step-by-step reference document workflow
# 1. Create a reference document (skip this step if you already have one):
#    pandoc -o custom-reference.docx --print-default-data-file reference.docx
# 2. Convert with custom styling:
"Convert this text to DOCX using /path/to/custom-reference.docx as reference and save as /path/to/styled-output.docx"

❌ Incorrect Usage:

bash
# Missing filename and extension
"Save this as PDF in /documents/"

# Missing complete path
"Convert this to PDF"

# Missing extension
"Save as /documents/story"

Common Issues and Solutions

  1. PDF Conversion Fails

    • Error: "xelatex not found"
    • Solution: Install TeX Live first (see installation commands above)
  2. File Conversion Fails

    • Error: "Invalid file path"
    • Solution: Provide complete path including filename and extension
    • Example: /path/to/document.pdf instead of just /path/to/
  3. Format Conversion Fails

    • Error: "Unsupported format"
    • Solution: Use only supported formats:
      • Basic: markdown, html, txt, ipynb, odt
      • Advanced: pdf, docx, rst, latex, epub
  4. Reference Document Issues

    • Error: "Reference document not found"
    • Solution: Ensure the reference document path exists and is accessible
    • Note: Reference documents only work with DOCX output format
    • How to create: pandoc -o reference.docx --print-default-data-file reference.docx
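
The reference document workflow from item 4 can also be expressed as two pandoc CLI calls. The sketch below only assembles the commands; the file names are placeholders, `reference_doc_commands` is a hypothetical helper, and how the server passes `reference_doc` to pandoc internally is not confirmed here.

```python
import shlex


def reference_doc_commands(input_file: str, reference_doc: str, output_file: str):
    """Return (create, convert) pandoc commands for a styled DOCX conversion."""
    # Step 1: write pandoc's default reference.docx to a file you can restyle.
    create = ["pandoc", "-o", reference_doc, "--print-default-data-file", "reference.docx"]
    # Step 2: convert using the (edited) reference document for styling.
    convert = ["pandoc", input_file, f"--reference-doc={reference_doc}", "-o", output_file]
    return create, convert


create_cmd, convert_cmd = reference_doc_commands(
    "input.md", "custom-reference.docx", "styled-output.docx"
)
print(shlex.join(create_cmd))
print(shlex.join(convert_cmd))
```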

Quickstart

Installing manually via claude_desktop_config.json config file

  • On macOS: open ~/Library/Application\ Support/Claude/claude_desktop_config.json
  • On Windows: %APPDATA%/Claude/claude_desktop_config.json

a) Only for local development & contribution to this repo

ℹ️ Replace <DIRECTORY> with your locally cloned project path

json
{
  "mcpServers": {
    "mcp-pandoc": {
      "command": "uv",
      "args": [
        "--directory",
        "<DIRECTORY>/mcp-pandoc",
        "run",
        "mcp-pandoc"
      ]
    }
  }
}

b) Published Servers Configuration - Consumers should use this config

json
{
  "mcpServers": {
    "mcp-pandoc": {
      "command": "uvx",
      "args": [
        "mcp-pandoc"
      ]
    }
  }
}

  • If you run into any issues, use the "Published Servers Configuration" above directly.

Note: To use a locally configured mcp-pandoc, follow option a) "Only for local development & contribution to this repo" above.
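
If you prefer to script the manual edit, the published-server entry can be merged into an existing config file without overwriting other servers. This is a sketch, not an official installer: `add_mcp_pandoc` is a hypothetical helper, and the demo writes to a temporary file rather than your real claude_desktop_config.json.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_pandoc(config_path: Path) -> dict:
    """Add the published mcp-pandoc entry to a Claude Desktop config file."""
    config = {}
    if config_path.exists() and config_path.read_text().strip():
        config = json.loads(config_path.read_text())
    # Keep any servers that are already configured.
    config.setdefault("mcpServers", {})["mcp-pandoc"] = {
        "command": "uvx",
        "args": ["mcp-pandoc"],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already lists another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text('{"mcpServers": {"other-server": {"command": "npx", "args": []}}}')
    updated = add_mcp_pandoc(demo_path)
    print(json.dumps(updated, indent=2))
```

Point `config_path` at the macOS or Windows location listed above once you are comfortable with the result, and restart Claude Desktop afterwards.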

Development

Testing

To run the comprehensive test suite and validate all supported bidirectional conversions, use the following command:

bash
uv run pytest tests/test_conversions.py

This ensures backward compatibility and verifies the tool's core functionality.

Building and Publishing

To prepare the package for distribution:

  1. Sync dependencies and update lockfile:
bash
uv sync
  2. Build package distributions:
bash
uv build

This will create source and wheel distributions in the dist/ directory.

  3. Publish to PyPI:
bash
uv publish

Note: You'll need to set PyPI credentials via environment variables or command flags:

  • Token: --token or UV_PUBLISH_TOKEN
  • Or username/password: --username/UV_PUBLISH_USERNAME and --password/UV_PUBLISH_PASSWORD

Debugging

Since MCP servers run over stdio, debugging can be challenging. For the best debugging experience, we strongly recommend using the MCP Inspector.

You can launch the MCP Inspector via npm with this command:

bash
npx @modelcontextprotocol/inspector uv --directory <DIRECTORY>/mcp-pandoc run mcp-pandoc

Upon launching, the Inspector will display a URL that you can access in your browser to begin debugging.


Contributing

We welcome contributions to enhance mcp-pandoc! Here's how you can get involved:

  1. Report Issues: Found a bug or have a feature request? Open an issue on our GitHub Issues page.
  2. Submit Pull Requests: Improve the codebase or add features by creating a pull request.

Repository Owner

vivekVells

User

Repository Details

Language Python
Default Branch main
Size 1,850 KB
Contributors 8
License MIT License
MCP Verified Nov 12, 2025

Programming Languages

Python
96.09%
HTML
1.54%
Dockerfile
1.51%
Jupyter Notebook
0.63%
TeX
0.23%

Tags

Topics

pandoc pandoc-markdown pandoc-template
