Agent skill

ocrmypdf

OCRmyPDF core skill: add a searchable OCR text layer to scanned PDFs, convert images to searchable PDFs, and OCR in 100+ languages via Tesseract. Use when the user needs to OCR a PDF, make a scanned PDF searchable, or extract text from scanned documents.

Install this agent skill to your Project

npx add-skill https://github.com/partme-ai/full-stack-skills/tree/main/skills/ocrmypdf-skills/ocrmypdf

SKILL.md

OCRmyPDF — Core OCR Guide

Overview

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. It uses Tesseract OCR, supports 100+ languages, produces PDF/A by default, and distributes work across all CPU cores.

For image processing (deskew, rotate, clean), see the ocrmypdf-image skill. For optimization and PDF/A options, see ocrmypdf-optimize. For batch/Docker/scripting, see ocrmypdf-batch. For Python API and plugins, see ocrmypdf-api.

Installation

One-liner installs (recommended)

| OS | Command |
| --- | --- |
| Debian / Ubuntu | `apt install ocrmypdf` |
| Fedora | `dnf install ocrmypdf tesseract-osd` |
| macOS (Homebrew) | `brew install ocrmypdf` |
| macOS (MacPorts) | `port install ocrmypdf` |
| FreeBSD | `pkg install py-ocrmypdf` |
| Snap | `snap install ocrmypdf` |

pip install (latest version)

bash

# After installing system dependencies (Tesseract, Ghostscript)
pip install ocrmypdf

Verify

bash

ocrmypdf --version
ocrmypdf --help

Requirements

Python 3.11+
Tesseract 4.1.1+ (OCR engine)
Ghostscript 9.54+ or pypdfium2 (PDF rasterization)
Optional: jbig2enc (compression), pngquant (image optimization), unpaper (cleaning)
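
Before a first run it can help to confirm that the required programs are actually on PATH. The following is a minimal sketch: `missing_tools` is a hypothetical helper name, and it assumes Ghostscript is installed under its usual binary name `gs` (skip that check if you rely on pypdfium2 instead).

```shell
# Print each of the given commands that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (assumes the Ghostscript binary is called "gs"):
#   missing=$(missing_tools ocrmypdf tesseract gs)
#   [ -z "$missing" ] || echo "Missing: $missing" >&2
```

An empty result means every listed program was found; the check says nothing about version numbers, so `ocrmypdf --version` and `tesseract --version` are still worth a look.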

Quick Start

bash

# Basic OCR — input scanned PDF, output searchable PDF/A
ocrmypdf input.pdf output.pdf

# OCR an image file directly
ocrmypdf --image-dpi 300 scan.png output.pdf

# OCR in place (only overwrites on success)
ocrmypdf myfile.pdf myfile.pdf

Language Support

OCRmyPDF uses Tesseract language packs. Install them for your OS:

bash

# Debian / Ubuntu
apt-cache search tesseract-ocr          # List all language packs
apt install tesseract-ocr-chi-sim       # Chinese Simplified
apt install tesseract-ocr-fra           # French

# macOS (Homebrew)
brew install tesseract-lang             # All languages

# Fedora
dnf search tesseract-langpack
dnf install tesseract-langpack-ita      # Italian

Using languages

bash

# Single language
ocrmypdf -l fra document.pdf output.pdf

# Multiple languages
ocrmypdf -l eng+fra bilingual.pdf output.pdf

# Chinese Simplified + English
ocrmypdf -l chi_sim+eng chinese-doc.pdf output.pdf

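Note: Language identifiers are the names of Tesseract language packs, mostly three-letter ISO 639 codes (such as `fra` or `deu`), sometimes with a script or variant suffix (such as `chi_sim`). Run `tesseract --list-langs` to see which packs are installed.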
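
The `-l` argument can be assembled and checked before OCR starts. This is a sketch under assumptions: `join_langs` and `missing_langs` are hypothetical helper names, and `missing_langs` assumes that `tesseract --list-langs` prints one installed language per line after a header line.

```shell
# Join language codes with "+" for the -l option, e.g. eng fra -> eng+fra.
join_langs() {
    joined=""
    for code in "$@"; do
        joined="${joined:+$joined+}$code"
    done
    printf '%s\n' "$joined"
}

# Print each requested code that tesseract does not list as installed.
# stderr is merged in because some Tesseract builds print the list there.
missing_langs() {
    installed=$(tesseract --list-langs 2>&1)
    for code in "$@"; do
        printf '%s\n' "$installed" | grep -qx -- "$code" || printf '%s\n' "$code"
    done
}

# Usage:
#   [ -z "$(missing_langs eng fra)" ] &&
#       ocrmypdf -l "$(join_langs eng fra)" bilingual.pdf output.pdf
```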

OCR Modes

Default mode (stop on existing text)

bash

# Without a mode flag, OCRmyPDF exits with an error if a page already contains text; pick one of the modes below in that case
ocrmypdf input.pdf output.pdf

Force OCR (`--force-ocr` or `-m force`)

bash

# Rasterize and OCR all pages, even those with existing text
ocrmypdf --force-ocr input.pdf output.pdf
# v17+ short form:
ocrmypdf -m force input.pdf output.pdf

Redo OCR (`--redo-ocr` or `-m redo`)

bash

# Replace existing OCR without rasterizing (preserves quality)
ocrmypdf --redo-ocr input.pdf output.pdf
# v17+ short form:
ocrmypdf -m redo input.pdf output.pdf

Skip text (`--skip-text` or `-m skip`)

bash

# Leave pages that already contain text untouched; OCR only the pages without text
ocrmypdf --skip-text input.pdf output.pdf
# v17+ short form:
ocrmypdf -m skip input.pdf output.pdf

No OCR (image processing only)

bash

# Apply image processing / PDF/A conversion without OCR
ocrmypdf --ocr-engine none input.pdf output.pdf
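
The mode flags above can be combined into a small retry wrapper. This is a sketch, not a recommended default: `ocr_with_fallback` is a hypothetical helper name, and it assumes `ocrmypdf` exits with status 6 when it refuses a file because a page already contains text. That status is worth verifying against your installed version.

```shell
# Try a plain run first. If ocrmypdf reports existing text (assumed exit
# status 6), retry with --skip-text so only pages without text are OCRed.
ocr_with_fallback() {
    input=$1
    output=$2
    status=0
    ocrmypdf "$input" "$output" || status=$?
    if [ "$status" -eq 6 ]; then
        status=0
        ocrmypdf --skip-text "$input" "$output" || status=$?
    fi
    return "$status"
}

# Usage:
#   ocr_with_fallback input.pdf output.pdf || echo "OCR failed" >&2
```

Falling back to `--skip-text` is the conservative choice because it leaves pages with text untouched; substitute `--redo-ocr` or `--force-ocr` if the existing text layer is known to be poor.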

Page Selection

bash

# OCR only specific pages
ocrmypdf --pages 1,3,5-10 input.pdf output.pdf

# OCR only the first page, minimal changes elsewhere
ocrmypdf --pages 1 --output-type pdf --optimize 0 input.pdf output.pdf

Output Types

bash

# PDF/A (default) — for archival
ocrmypdf --output-type pdfa input.pdf output.pdf

# Standard PDF
ocrmypdf --output-type pdf input.pdf output.pdf

# Auto (v17+) — speculative PDF/A, falls back to standard PDF
ocrmypdf --output-type auto input.pdf output.pdf

# No output PDF — only produce sidecar text
ocrmypdf --output-type none --sidecar text.txt input.pdf -

Sidecar Text File

bash

# Produce a companion text file with OCR text
ocrmypdf --sidecar output.txt input.pdf output.pdf
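
A sidecar file makes the recognized text available to ordinary text tools. As an illustration, the sketch below runs OCR with `--sidecar` and then searches the result; `ocr_and_search` is a hypothetical helper name, and placing the sidecar next to the output PDF with a `.txt` extension is an assumption of this example.

```shell
# OCR a PDF, write the text to a sidecar next to the output, and print the
# sidecar lines (with line numbers) that contain the keyword, ignoring case.
ocr_and_search() {
    input=$1
    output=$2
    keyword=$3
    sidecar="${output%.pdf}.txt"
    ocrmypdf --sidecar "$sidecar" "$input" "$output" || return $?
    grep -i -n -- "$keyword" "$sidecar"
}

# Usage:
#   ocr_and_search scan.pdf searchable.pdf invoice
```

The function returns the exit status of `grep`, so a status of 1 means OCR succeeded but the keyword was not found.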

Metadata

bash

# Set output PDF metadata
ocrmypdf --title "My Document" --author "Author Name" --subject "Subject" input.pdf output.pdf

Parallel Processing

bash

# Use 4 CPU cores (default: all available)
ocrmypdf --jobs 4 input.pdf output.pdf

# Single-threaded
ocrmypdf --jobs 1 input.pdf output.pdf
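
On a shared machine it may be preferable to leave some cores free rather than hard-code a number. The sketch below derives a `--jobs` value from the core count; `half_cores` is a hypothetical helper name, and it assumes `getconf _NPROCESSORS_ONLN` is available (it is on common Linux and macOS systems).

```shell
# Print half the number of online CPU cores, but never less than 1.
half_cores() {
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    half=$((cores / 2))
    [ "$half" -ge 1 ] || half=1
    printf '%s\n' "$half"
}

# Usage:
#   ocrmypdf --jobs "$(half_cores)" input.pdf output.pdf
```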

Common Recipes

Make a scanned PDF searchable

bash

ocrmypdf scanned.pdf searchable.pdf

Convert image to searchable PDF

bash

ocrmypdf --image-dpi 300 scan.jpg output.pdf

OCR a multilingual document

bash

ocrmypdf -l eng+deu+fra multilingual.pdf output.pdf

Re-OCR with newer Tesseract

bash

ocrmypdf --redo-ocr old-ocr.pdf updated.pdf

Strip all text/OCR from a PDF

bash

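# --force-ocr rasterizes every page (vector graphics included), which discards existing text; with the OCR engine disabled, no new text layer is added
ocrmypdf --ocr-engine none --force-ocr input.pdf stripped.pdf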

Quick Reference

| Task | Command |
| --- | --- |
| Basic OCR | `ocrmypdf input.pdf output.pdf` |
| Specify language | `ocrmypdf -l fra input.pdf output.pdf` |
| Multiple languages | `ocrmypdf -l eng+fra input.pdf output.pdf` |
| Force re-OCR all pages | `ocrmypdf --force-ocr input.pdf output.pdf` |
| Replace existing OCR | `ocrmypdf --redo-ocr input.pdf output.pdf` |
| Skip pages with text | `ocrmypdf --skip-text input.pdf output.pdf` |
| Specific pages only | `ocrmypdf --pages 1,3,5-10 input.pdf output.pdf` |
| Output standard PDF | `ocrmypdf --output-type pdf input.pdf output.pdf` |
| Extract text sidecar | `ocrmypdf --sidecar text.txt input.pdf output.pdf` |
| Image to PDF | `ocrmypdf --image-dpi 300 image.png output.pdf` |
| In-place OCR | `ocrmypdf myfile.pdf myfile.pdf` |
| Set metadata | `ocrmypdf --title "Title" input.pdf output.pdf` |
| Parallel jobs | `ocrmypdf --jobs 4 input.pdf output.pdf` |

Troubleshooting

"Tesseract not found": Install Tesseract and ensure it's on PATH.
Poor OCR quality: Check language packs (-l), try --deskew (see ocrmypdf-image), or --oversample 300.
"Input file has text": Use --force-ocr, --redo-ocr, or --skip-text as appropriate.
Large output files: See ocrmypdf-optimize for --optimize levels and JBIG2.
Signed PDFs: Use --invalidate-digital-signatures to override (signatures will be invalidated).
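
In scripts, the exit status of `ocrmypdf` is often a quicker diagnostic than the log output. The mapping below is a partial sketch based on the exit statuses listed in OCRmyPDF's documentation for the releases this example assumes; `explain_status` is a hypothetical helper name, and the numbers should be checked against your installed version.

```shell
# Translate a few ocrmypdf exit statuses into hints (assumed values, partial list).
explain_status() {
    case "$1" in
        0) echo "OK" ;;
        2) echo "Input file could not be read as a PDF or image" ;;
        3) echo "A required program such as Tesseract is missing or too old" ;;
        6) echo "Input already has text: use --force-ocr, --redo-ocr, or --skip-text" ;;
        8) echo "Input PDF is encrypted" ;;
        *) echo "Other error (status $1), see the log output" ;;
    esac
}

# Usage:
#   ocrmypdf input.pdf output.pdf || explain_status $?
```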

Maintainer

partme-ai Core maintainer

Source details

Full Name: partme-ai/full-stack-skills
Branch: main
Path in repo: skills/ocrmypdf-skills/ocrmypdf
License: Other
Topics: claude-code agent-skills cursor skills codebuddy qoder
