Agent skill

markitdown

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.

Install this agent skill to your Project

npx add-skill https://github.com/Microck/ordinary-claude-skills/tree/main/skills_all/claude-scientific-skills/scientific-skills/markitdown

SKILL.md

MarkItDown

Overview

MarkItDown is a Python utility that converts many file formats to Markdown for use with large language models and text analysis pipelines. It preserves document structure (headings, lists, tables, hyperlinks) while producing clean, token-efficient output.

When to Use This Skill

Use this skill when users request:

  • Converting documents to Markdown format
  • Extracting text from PDF, Word, PowerPoint, or Excel files
  • Performing OCR on images to extract text
  • Transcribing audio files to text
  • Extracting YouTube video transcripts
  • Processing HTML, EPUB, or web content to Markdown
  • Converting structured data (CSV, JSON, XML) to readable Markdown
  • Batch converting multiple files or ZIP archives
  • Preparing documents for LLM analysis or RAG systems

Core Capabilities

1. Document Conversion

Convert Office documents and PDFs to Markdown while preserving structure.

Supported formats:

  • PDF files (with optional Azure Document Intelligence integration)
  • Word documents (DOCX)
  • PowerPoint presentations (PPTX)
  • Excel spreadsheets (XLSX, XLS)

Basic usage:

python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)

Command-line:

bash
markitdown document.pdf -o output.md

See references/document_conversion.md for detailed documentation on document-specific features.
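
The basic usage above prints the result. To save it from Python instead of the CLI, a minimal sketch could look like the following. `convert_to_file` is a hypothetical helper, not part of MarkItDown; it relies only on the `convert()` call and the `text_content` attribute shown above.

```python
from pathlib import Path


def convert_to_file(converter, source, dest):
    """Convert `source` with `converter` and write the Markdown to `dest`.

    `converter` is any object with a convert(path) method whose result
    exposes `text_content` (for example, a MarkItDown instance).
    """
    result = converter.convert(str(source))
    dest = Path(dest)
    # Create the output folder on demand so nested destinations work.
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(result.text_content, encoding="utf-8")
    return dest


# Usage with MarkItDown (assumes the package is installed):
#   from markitdown import MarkItDown
#   convert_to_file(MarkItDown(), "document.pdf", "out/document.md")
```

Passing the converter in as an argument lets one `MarkItDown()` instance be reused across many files.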

2. Media Processing

Extract text from images using OCR and transcribe audio files to text.

Supported formats:

  • Images (JPEG, PNG, GIF, etc.) with EXIF metadata extraction
  • Audio files with speech transcription (requires speech_recognition)

Image with OCR:

python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("image.jpg")
print(result.text_content)  # Includes EXIF metadata and OCR text

Audio transcription:

python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("audio.wav")
print(result.text_content)  # Transcribed speech

See references/media_processing.md for advanced media handling options.

3. Web Content Extraction

Convert web-based content and e-books to Markdown.

Supported formats:

  • HTML files and web pages
  • YouTube video transcripts (via URL)
  • EPUB books
  • RSS feeds

YouTube transcript:

python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("https://youtube.com/watch?v=VIDEO_ID")
print(result.text_content)

See references/web_content.md for web extraction details.

4. Structured Data Handling

Convert structured data formats to readable Markdown tables.

Supported formats:

  • CSV files
  • JSON files
  • XML files

CSV to Markdown table:

python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("data.csv")
print(result.text_content)  # Formatted as Markdown table

See references/structured_data.md for format-specific options.
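
To show what a CSV rendered as a Markdown table looks like, here is a standard-library sketch. It is an illustration only, not MarkItDown's implementation, and MarkItDown's exact output may differ. The function name `csv_to_markdown_table` is hypothetical.

```python
import csv
import io


def csv_to_markdown_table(csv_text: str) -> str:
    """Render CSV text as a pipe-delimited Markdown table (first row = header)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    width = len(header)

    def fmt(row):
        # Pad or truncate so every row has the header's column count,
        # and escape pipes so cell text cannot break the table.
        cells = (row + [""] * width)[:width]
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(r) for r in body)
    return "\n".join(lines)


print(csv_to_markdown_table("name,score\nAda,95\nLin,88\n"))
# | name | score |
# | --- | --- |
# | Ada | 95 |
# | Lin | 88 |
```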

5. Advanced Integrations

Enhance conversion quality with AI-powered features.

Azure Document Intelligence: For enhanced PDF processing with better table extraction and layout analysis:

python
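from azure.core.credentials import AzureKeyCredential
from markitdown import MarkItDown

# Assumption: key-based auth is passed as a credential object via
# docintel_credential; verify the keyword against your installed version.
md = MarkItDown(docintel_endpoint="<endpoint>", docintel_credential=AzureKeyCredential("<key>"))
result = md.convert("complex.pdf")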

LLM-Powered Image Descriptions: Generate detailed image descriptions using GPT-4o:

python
from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI()
md = MarkItDown(llm_client=client, llm_model="gpt-4o")
result = md.convert("presentation.pptx")  # Images described with LLM

See references/advanced_integrations.md for integration details.

6. Batch Processing

Process multiple files or entire ZIP archives at once.

ZIP file processing:

python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("archive.zip")
print(result.text_content)  # All files converted and concatenated

Batch script: Use the provided batch processing script for directory conversion:

bash
python scripts/batch_convert.py /path/to/documents /path/to/output

See scripts/batch_convert.py for implementation details.
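
For orientation, a directory conversion can be sketched roughly as below. This is not the bundled `scripts/batch_convert.py`, whose contents are not shown here. The function name, the extension list, and the `<name>.<ext>.md` output naming are all assumptions made for illustration.

```python
from pathlib import Path

# Extensions to pick up; an illustrative subset, not MarkItDown's full list.
DEFAULT_EXTENSIONS = frozenset({".pdf", ".docx", ".pptx", ".xlsx", ".html", ".csv"})


def batch_convert(converter, input_dir, output_dir, extensions=DEFAULT_EXTENSIONS):
    """Convert every matching file under input_dir, mirroring the tree in output_dir.

    Returns (converted, failed): a list of output paths and a list of
    (source_path, exception) pairs.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    converted, failed = [], []
    for src in sorted(input_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in extensions:
            continue
        rel = src.relative_to(input_dir)
        # Keep the original extension in the name so a.pdf and a.docx do not collide.
        dest = output_dir / rel.with_name(rel.name + ".md")
        try:
            text = converter.convert(str(src)).text_content
        except Exception as exc:  # one bad file should not stop the batch
            failed.append((src, exc))
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(text, encoding="utf-8")
        converted.append(dest)
    return converted, failed


# Usage with MarkItDown (assumes the package is installed):
#   from markitdown import MarkItDown
#   done, errors = batch_convert(MarkItDown(), "documents", "markdown")
```

Collecting failures instead of raising lets a long run finish and report the problem files at the end.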

Installation

Full installation (all features):

bash
uv pip install 'markitdown[all]'

Modular installation (specific features):

bash
uv pip install 'markitdown[pdf]'           # PDF support
uv pip install 'markitdown[docx]'          # Word support
uv pip install 'markitdown[pptx]'          # PowerPoint support
uv pip install 'markitdown[xlsx]'          # Excel support
uv pip install 'markitdown[audio-transcription]'    # Audio transcription
uv pip install 'markitdown[youtube-transcription]'  # YouTube transcripts

Requirements:

  • Python 3.10 or higher

Output Format

MarkItDown produces clean, token-efficient Markdown optimized for LLM consumption:

  • Preserves headings, lists, and tables
  • Maintains hyperlinks and formatting
  • Includes metadata where relevant (EXIF, document properties)
  • No temporary files created (streaming approach)

Common Workflows

Preparing documents for RAG:

python
from markitdown import MarkItDown

md = MarkItDown()

# Convert knowledge base documents
docs = ["manual.pdf", "guide.docx", "faq.html"]
markdown_content = []

for doc in docs:
    result = md.convert(doc)
    markdown_content.append(result.text_content)

# Now ready for embedding and indexing
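
Before embedding, converted Markdown is often split into smaller chunks. Because the output preserves headings, one simple option is to split on them, as in the sketch below. `split_by_headings` is a hypothetical helper, and heading-based chunking is only one of several reasonable strategies.

```python
import re

# ATX headings: one to six '#' characters followed by whitespace and a title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown: str):
    """Split Markdown into (heading, body) chunks, one per ATX heading.

    Text before the first heading is returned under an empty heading.
    """
    chunks, title, lines = [], "", []

    def flush():
        # Emit the current chunk unless it has neither a title nor any text.
        if title or any(line.strip() for line in lines):
            chunks.append((title, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


doc = "intro\n\n# Setup\nInstall it.\n\n## Usage\nRun it.\n"
print(split_by_headings(doc))
# [('', 'intro'), ('Setup', 'Install it.'), ('Usage', 'Run it.')]
```

This sketch treats any line starting with `#` as a heading, including lines inside fenced code blocks, so documents with embedded code may need a more careful splitter.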

Document analysis pipeline:

bash
# Convert all PDFs in directory
for file in documents/*.pdf; do
    markitdown "$file" -o "markdown/$(basename "$file" .pdf).md"
done

Plugin System

MarkItDown supports extensible plugins for custom conversion logic. Plugins are disabled by default for security:

python
from markitdown import MarkItDown

# Enable plugins if needed
md = MarkItDown(enable_plugins=True)

Resources

This skill includes comprehensive reference documentation for each capability:

  • references/document_conversion.md - Detailed PDF, DOCX, PPTX, XLSX conversion options
  • references/media_processing.md - Image OCR and audio transcription details
  • references/web_content.md - HTML, YouTube, and EPUB extraction
  • references/structured_data.md - CSV, JSON, XML conversion formats
  • references/advanced_integrations.md - Azure Document Intelligence and LLM integration
  • scripts/batch_convert.py - Batch processing utility for directories
