Agent skill
pdf-to-markdown
Convert a PDF to clean Markdown with image content described as text. Use when the user wants to convert a PDF to Markdown, extract content from a PDF, or prepare PDF content for AI tools.
Install this agent skill to your Project
npx add-skill https://github.com/krishagel/geoffrey/tree/main/skills/pdf-to-markdown
PDF to Markdown Converter
Convert PDF files to clean, well-structured Markdown. Tables become Markdown tables. Images and graphics are described as text (no image files are generated).
Quick Start
uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py input.pdf
Output: ~/Desktop/{filename}.md
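The default output location can be pictured with a short sketch. It assumes that {filename} means the input file's name without its .pdf suffix; the helper name `default_output_path` is hypothetical and is not taken from the script itself.

```python
from pathlib import Path


def default_output_path(pdf_path: str) -> Path:
    """Hypothetical helper: map input.pdf to ~/Desktop/input.md."""
    # Assumption: {filename} is the input name with the .pdf suffix dropped.
    stem = Path(pdf_path).stem
    return Path.home() / "Desktop" / f"{stem}.md"


if __name__ == "__main__":
    print(default_output_path("~/Documents/report.pdf"))
```

Passing an explicit second argument to the script, as in the "Specify output location" use case, bypasses this default.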
Options
| Flag | Description |
|---|---|
| --no-llm | Skip LLM processing (faster, images become [Image] placeholders) |
| --force-ocr | Force OCR on all pages (for scanned PDFs) |
| --page-range "0,5-10" | Process specific pages only |
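To make the --page-range syntax concrete, here is a minimal sketch of how a value such as "0,5-10" could expand into page indices. It assumes pages are zero-indexed and ranges are inclusive at both ends, which the example value suggests but this document does not state; `parse_page_range` is a hypothetical name, not the function the script or Marker actually uses.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a spec like "0,5-10" into a sorted list of page indices."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            # Assumption: the end of the range is included.
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_range("0,5-10"))  # [0, 5, 6, 7, 8, 9, 10]
```

Under these assumptions, "0,5-10" selects the first page plus pages 5 through 10, seven pages in total.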
Common Use Cases
Convert a PDF with default settings
uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py ~/Documents/report.pdf
Specify output location
uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py report.pdf ~/Documents/report.md
Fast conversion (no image descriptions)
uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py --no-llm report.pdf
Scanned PDF (force OCR)
uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py --force-ocr scanned_doc.pdf
Extract specific pages
uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py --page-range "0-5" large_report.pdf
Output
- Pure Markdown text (no embedded images)
- Tables converted to Markdown table format
- Images/charts described as text using LLM
- Clean formatting suitable for AI processing
Requirements
- GEMINI_API_KEY: Required for LLM image descriptions (loaded from 1Password)
- Use the --no-llm flag if you don't have Gemini API access
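The requirement above can be turned into a small wrapper that falls back to --no-llm when no key is available. This is a sketch under assumptions: it only checks whether GEMINI_API_KEY is present in the environment (how the key gets there from 1Password is out of scope), and `build_command` is a hypothetical helper, not part of the skill.

```python
import os
from collections.abc import Mapping

SCRIPT = "skills/pdf-to-markdown/scripts/convert_to_markdown.py"


def build_command(pdf_path: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Build the uv command, adding --no-llm when no Gemini key is set."""
    command = ["uv", "run", SCRIPT]
    if not env.get("GEMINI_API_KEY"):
        # No key: skip LLM processing, so images become [Image] placeholders.
        command.append("--no-llm")
    command.append(pdf_path)
    return command


if __name__ == "__main__":
    print(" ".join(build_command("report.pdf")))
```

The returned list can be passed directly to `subprocess.run(..., check=True)`, which avoids shell quoting issues with paths that contain spaces.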
First Run Note
The first run downloads ML models (~1-2 GB), which are cached at ~/.cache/marker/. Subsequent runs reuse the cache and are faster.
Technical Details
Uses the Marker library:
- 31k+ GitHub stars
- Best-in-class PDF conversion accuracy
- Surya OCR for 90+ languages
- Gemini LLM integration for image understanding