Agent skill
markitdown
Convert documents (PDF, Word, Excel, PowerPoint, images, HTML) to Markdown using microsoft/markitdown. Use for document analysis, content extraction, preprocessing for LLMs, or batch document conversion. Supports images with OCR/LLM descriptions, audio transcription, and ZIP archives.
Install this agent skill to your Project
npx add-skill https://github.com/rysweet/amplihack/tree/main/.claude/skills/markitdown
SKILL.md
Document to Markdown Conversion
Overview
Convert various document formats to clean Markdown using Microsoft's MarkItDown tool. Optimized for LLM processing, content extraction, and document analysis workflows.
Supported Formats: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx/.xls), Images (with OCR/LLM), HTML, Audio (with transcription), CSV, JSON, XML, ZIP archives, EPUB
Quick Start
Basic Usage
from markitdown import MarkItDown
md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)
Command Line
# Convert single file
markitdown document.pdf > output.md
markitdown document.pdf -o output.md
# Pipe input
cat document.pdf | markitdown
🔒 Security Considerations
Before using in production:
- ✅ Validate file types (MIME, not extension)
- ✅ Limit file sizes (prevent DoS)
- ✅ Sanitize file paths (prevent traversal)
- ✅ Protect API keys (never hardcode)
- ✅ Consider data privacy (external services)
See patterns.md for implementation details.
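As a rough illustration of "validate file types, not extensions", the sketch below inspects a file's leading bytes instead of trusting its name. The names `MAGIC_SIGNATURES` and `sniff_type` are hypothetical helpers written for this example, not part of markitdown, and the table covers only a few of the supported formats.

```python
from pathlib import Path
from typing import Optional

# Leading bytes ("magic numbers") for a few formats the skill accepts.
# DOCX, XLSX, PPTX and EPUB are all ZIP containers, so they share one signature.
MAGIC_SIGNATURES = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip-container",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_type(path: Path) -> Optional[str]:
    """Return a coarse type label from the file's leading bytes, or None."""
    with path.open("rb") as fh:
        head = fh.read(8)  # 8 bytes covers the longest signature above
    for signature, label in MAGIC_SIGNATURES.items():
        if head.startswith(signature):
            return label
    return None
```

A caller could reject any upload for which `sniff_type` returns `None` or a label that does not match the claimed extension before handing the file to the converter.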
API Key Security
❌ NEVER:
- Hardcode keys in code
- Commit .env files to git
- Log environment variables
✅ ALWAYS:
- Use environment variables:
export OPENAI_API_KEY="sk-..."  # pragma: allowlist secret
- Use secret management (AWS Secrets Manager, Azure Key Vault)
- Rotate keys regularly
Common Patterns
PDF Documents
# Basic PDF conversion
md = MarkItDown()
result = md.convert("report.pdf")
# With Azure Document Intelligence (better quality)
md = MarkItDown(docintel_endpoint="<your-endpoint>")
result = md.convert("report.pdf")
Office Documents
# Word documents - preserves structure
result = md.convert("document.docx")
# Excel - converts tables to markdown tables
result = md.convert("spreadsheet.xlsx")
# PowerPoint - extracts slide content
result = md.convert("presentation.pptx")
Images with Descriptions
# ✅ SECURE: Using environment variables for API keys
import os
from markitdown import MarkItDown
from openai import OpenAI
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY not set")
client = OpenAI(api_key=api_key)
md = MarkItDown(llm_client=client, llm_model="gpt-4o")
result = md.convert("diagram.jpg") # Gets AI-generated description
Batch Processing
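from pathlib import Path
from markitdown import MarkItDown
md = MarkItDown()
documents = Path(".").glob("*.pdf")
for doc in documents:
    result = md.convert(str(doc))
    # Write the Markdown next to the source file, e.g. report.pdf -> report.md
    output_path = doc.with_suffix(".md")
    output_path.write_text(result.text_content, encoding="utf-8")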
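In a larger batch, one unreadable document would stop the loop above. The sketch below shows one possible way to keep going and report failures at the end. `convert_batch` is a hypothetical helper, and the converter is passed in as a plain callable so the function does not depend on any particular library.

```python
from pathlib import Path
from typing import Callable


def convert_batch(
    paths: list[Path], convert: Callable[[Path], str]
) -> tuple[list[Path], dict[Path, str]]:
    """Convert each path, write a sibling .md file, and collect failures."""
    written: list[Path] = []
    failures: dict[Path, str] = {}
    for path in paths:
        try:
            text = convert(path)
        except Exception as exc:  # one bad document should not stop the batch
            failures[path] = f"{type(exc).__name__}: {exc}"
            continue
        out = path.with_suffix(".md")
        out.write_text(text, encoding="utf-8")
        written.append(out)
    return written, failures
```

With MarkItDown, the callable could be `lambda p: md.convert(str(p)).text_content`, using the same `md` instance as in the loop above.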
Installation
# Full installation (all features)
pip install 'markitdown[all]'
# Selective features
pip install 'markitdown[pdf, docx, pptx]'
Requirements: Python 3.10 or higher
Key Features
- Structure Preservation: Maintains headings, lists, tables, links
- Plugin System: Extend with custom converters
- Docker Support: Containerized deployments
- MCP Integration: Model Context Protocol server for LLM apps
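Because headings survive conversion, the Markdown output can be split into sections before it is passed to an LLM. The following is a minimal sketch of that idea, assuming ATX-style (`#`) headings. `split_by_heading` is a hypothetical helper, not a markitdown function, and it does not special-case `#` lines inside fenced code blocks.

```python
import re

# Matches ATX headings such as "# Title" or "### Sub-section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs.

    Text before the first heading is returned under an empty heading.
    """
    sections: list[tuple[str, list[str]]] = [("", [])]
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((match.group(2).strip(), []))
        else:
            sections[-1][1].append(line)
    # Drop the untitled leading section when it holds no text.
    return [
        (title, "\n".join(body).strip())
        for title, body in sections
        if title or any(part.strip() for part in body)
    ]
```

For example, `split_by_heading(result.text_content)` would yield one pair per heading in a converted document, which can then be chunked or filtered individually.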
When to Read Supporting Files
- reference.md - Read when you need:
  - Complete API reference and all configuration options
  - Azure Document Intelligence integration details
  - Plugin development guide
  - Docker and MCP server setup
  - Troubleshooting and error handling
- examples.md - Read when you need:
  - Working examples for specific file types
  - Batch processing workflows
  - Error handling patterns
  - Integration with existing pipelines
- patterns.md - Read when you need:
  - Production deployment patterns
  - Performance optimization strategies
  - Security considerations
  - Anti-patterns to avoid
Quick Reference
| File Type | Use Case | Command |
|---|---|---|
| PDF | Reports, papers | md.convert("file.pdf") |
| Word | Documents | md.convert("file.docx") |
| Excel | Data tables | md.convert("file.xlsx") |
| PowerPoint | Presentations | md.convert("file.pptx") |
| Images | Diagrams with OCR | md = MarkItDown(llm_client=client); md.convert("img.jpg") |
| HTML | Web pages | md.convert("page.html") |
| ZIP | Archives | md.convert("archive.zip") - processes contents |
⚠️ Common Mistakes to Avoid
Anti-Pattern 1: Hardcoded API Keys
# ❌ NEVER DO THIS
md = MarkItDown(llm_client=OpenAI(api_key="sk-hardcoded-key"))
# ✅ ALWAYS DO THIS
api_key = os.getenv("OPENAI_API_KEY")
md = MarkItDown(llm_client=OpenAI(api_key=api_key))
Anti-Pattern 2: Unvalidated File Paths
# ❌ Vulnerable to path traversal
user_input = "../../../etc/passwd"
md.convert(user_input)
# ✅ Validate and sanitize
from pathlib import Path
allowed_dir = Path("uploads").resolve()  # hypothetical base directory for user files
safe_path = (allowed_dir / user_input).resolve()
if not safe_path.is_relative_to(allowed_dir):
    raise ValueError("Invalid path")
md.convert(str(safe_path))
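The same containment check can be wrapped in a small reusable function. This is a sketch only: `resolve_within` is a hypothetical name, and it assumes Python 3.9 or newer for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_within(base_dir: Path, user_input: str) -> Path:
    """Resolve user_input against base_dir and reject anything outside it."""
    base = base_dir.resolve()
    # Joining with an absolute user_input discards base, so the check
    # below also catches inputs such as "/etc/passwd".
    candidate = (base / user_input).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_input!r}")
    return candidate
```

Resolving before comparing matters: `..` segments and symlinks are collapsed first, so the comparison is made on the path the file system would actually open.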
Anti-Pattern 3: Ignoring File Size Limits
# ❌ Can cause DoS
md.convert("huge_file.pdf") # No size check
# ✅ Check size first
max_size = 50 * 1024 * 1024  # 50 MB
if Path("huge_file.pdf").stat().st_size > max_size:
    raise ValueError("File too large")
md.convert("huge_file.pdf")
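To make the size check hard to forget, it can be placed in front of the converter itself. The sketch below assumes the converter is supplied as a callable. `convert_if_small` and `MAX_BYTES` are hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import Callable

MAX_BYTES = 50 * 1024 * 1024  # 50 MB, the same limit as in the snippet above


def convert_if_small(
    path: Path, convert: Callable[[Path], str], max_bytes: int = MAX_BYTES
) -> str:
    """Run convert(path) only when the file is within the size limit."""
    size = path.stat().st_size
    if size > max_bytes:
        raise ValueError(f"{path.name} is {size} bytes; limit is {max_bytes}")
    return convert(path)
```

The limit is checked against the size on disk, so it does not bound the memory a compressed format (for example a ZIP archive) may need once expanded.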
Common Issues
Import Error: Ensure you are running Python 3.10 or higher and that markitdown is installed in the active environment
Missing Dependencies: Install with pip install 'markitdown[all]'
Image Descriptions Not Working: Requires LLM client (OpenAI or compatible)
For detailed troubleshooting, see reference.md.
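The first two issues can be checked programmatically before any conversion is attempted. The following sketch uses only the standard library. `diagnose_environment` is a hypothetical helper, and it checks only that the `markitdown` package is importable, not which optional extras are installed.

```python
import importlib.util
import sys


def diagnose_environment(min_version: tuple[int, int] = (3, 10)) -> list[str]:
    """Return human-readable problems; an empty list means the basics look fine."""
    problems: list[str] = []
    if sys.version_info[:2] < min_version:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_version))
        problems.append(f"Python {found} found; {wanted} or higher is required")
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec("markitdown") is None:
        problems.append(
            "markitdown is not importable; try: pip install 'markitdown[all]'"
        )
    return problems
```

Printing the returned list at startup gives a clearer message than a bare `ImportError` raised deep inside a pipeline.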