ocrmypdf-api

OCRmyPDF Python API and plugin skill — use OCRmyPDF programmatically from Python, integrate with applications, and extend with plugins (EasyOCR, PaddleOCR, AppleOCR). Use when the user needs to call OCRmyPDF from Python code, build OCR pipelines, or use alternative OCR engines.

Install this agent skill to your Project

npx add-skill https://github.com/partme-ai/full-stack-skills/tree/main/skills/ocrmypdf-skills/ocrmypdf-api

OCRmyPDF — Python API & Plugins Guide

Overview

OCRmyPDF provides a Python API for programmatic use and a plugin interface for extending or replacing OCR engines. This skill covers the Python API, integration patterns, and the plugin ecosystem.

For CLI usage, see the ocrmypdf skill. For batch scripting, see ocrmypdf-batch.

Python API

Basic usage

python

import ocrmypdf

# Basic OCR
exit_code = ocrmypdf.ocr('input.pdf', 'output.pdf')

# With options
exit_code = ocrmypdf.ocr(
    'input.pdf',
    'output.pdf',
    language='eng+fra',
    deskew=True,
    rotate_pages=True,
    skip_text=True,
    optimize=2,
    jobs=4,
)

Return codes

python

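import ocrmypdf
from ocrmypdf.exceptions import (
    ExitCodeException, InputFileError, PriorOcrFoundError,
)

# In current OCRmyPDF releases, ocr() raises on failure instead of
# returning an error code, so failures are handled as exceptions.
try:
    result = ocrmypdf.ocr('input.pdf', 'output.pdf')
except PriorOcrFoundError:
    # Raised when the PDF already has text and none of
    # skip_text / redo_ocr / force_ocr is set
    print("PDF already has OCR text")
except InputFileError:
    print("Input file issue")
except ExitCodeException as e:
    # Base class for OCRmyPDF errors; exit_code matches the CLI exit status
    print(f"Error: {e.exit_code.name}: {e}")
else:
    if result == ocrmypdf.ExitCode.ok:
        print("OCR completed successfully")
    else:
        print(f"Finished with status: {result.name}")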
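
When building a pipeline over many files, it can help to record a per-file status rather than stop at the first failure. The following is a minimal sketch, not part of OCRmyPDF: the helper name `ocr_directory` and its `ocr_func` parameter are hypothetical, and `ocr_func` exists only so the loop can be exercised with a stand-in function when OCRmyPDF is not installed.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def ocr_directory(
    src_dir: str,
    dst_dir: str,
    ocr_func: Optional[Callable[..., object]] = None,
    **options,
) -> Dict[str, str]:
    """OCR every *.pdf in src_dir into dst_dir; return {filename: status}."""
    if ocr_func is None:
        # Imported lazily so the helper can be tested without OCRmyPDF installed
        import ocrmypdf
        ocr_func = ocrmypdf.ocr

    out_root = Path(dst_dir)
    out_root.mkdir(parents=True, exist_ok=True)

    results: Dict[str, str] = {}
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        try:
            ocr_func(str(pdf), str(out_root / pdf.name), **options)
            results[pdf.name] = "ok"
        except Exception as exc:
            # Record the failure and continue with the remaining files
            results[pdf.name] = f"failed: {type(exc).__name__}"
    return results
```

A call such as `ocr_directory('scans', 'searchable', skip_text=True)` would pass `skip_text=True` through to each `ocrmypdf.ocr()` call; the directory names here are placeholders.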

Common API parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `language` | str | Tesseract language(s), e.g. `'eng+fra'` |
| `deskew` | bool | Straighten crooked pages |
| `rotate_pages` | bool | Auto-rotate pages |
| `skip_text` | bool | Skip pages that already have text |
| `force_ocr` | bool | Force OCR on all pages |
| `redo_ocr` | bool | Replace existing OCR |
| `optimize` | int | Optimization level (0-3) |
| `output_type` | str | `'pdfa'`, `'pdf'`, `'auto'`, `'none'` |
| `jobs` | int | Number of parallel workers |
| `sidecar` | str | Path for sidecar text file |
| `image_dpi` | int | DPI to assume for image inputs |
| `clean` | bool | Clean pages with unpaper for OCR only; the output keeps the original image |
| `clean_final` | bool | Clean pages with unpaper and use the cleaned image in the output |
| `remove_background` | bool | Remove noisy backgrounds |
| `oversample` | int | Oversample low-resolution images to this DPI |
| `pages` | str | Page range, e.g. `'1,3,5-10'` |
| `title` | str | Output PDF title |
| `author` | str | Output PDF author |
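
Some of these options constrain each other: `skip_text`, `force_ocr`, and `redo_ocr` are mutually exclusive, and `optimize` is limited to 0-3. The sketch below checks such combinations before any file is touched. The helper `build_ocr_options` is a hypothetical name introduced for illustration, not an OCRmyPDF function, and it covers only the few rules stated here.

```python
from typing import Any, Dict


def build_ocr_options(**options: Any) -> Dict[str, Any]:
    """Validate a few option combinations before calling ocrmypdf.ocr()."""
    # OCRmyPDF accepts at most one of these three page-handling modes
    modes = [m for m in ("skip_text", "force_ocr", "redo_ocr") if options.get(m)]
    if len(modes) > 1:
        raise ValueError(f"Choose only one of: {', '.join(modes)}")

    optimize = options.get("optimize")
    if optimize is not None and optimize not in (0, 1, 2, 3):
        raise ValueError("optimize must be an integer from 0 to 3")

    jobs = options.get("jobs")
    if jobs is not None and jobs < 1:
        raise ValueError("jobs must be at least 1")

    # Drop unset (None) values so OCRmyPDF's own defaults apply
    return {k: v for k, v in options.items() if v is not None}
```

The validated dictionary can then be unpacked into the call, for example `ocrmypdf.ocr('input.pdf', 'output.pdf', **build_ocr_options(language='eng', deskew=True))`.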

Integration example: Flask web service

python

from flask import Flask, request, send_file
import io
import os
import tempfile

import ocrmypdf
from ocrmypdf.exceptions import ExitCodeException

app = Flask(__name__)

@app.route('/ocr', methods=['POST'])
def ocr_endpoint():
    """OCR a PDF via HTTP POST."""
    if 'file' not in request.files:
        return {'error': 'No file uploaded'}, 400

    uploaded = request.files['file']
    # A private temp directory holds input and output and is removed on exit
    with tempfile.TemporaryDirectory() as workdir:
        in_path = os.path.join(workdir, 'input.pdf')
        out_path = os.path.join(workdir, 'output.pdf')
        uploaded.save(in_path)

        try:
            ocrmypdf.ocr(
                in_path, out_path,
                language='eng',
                skip_text=True,
                optimize=2,
            )
        except ExitCodeException as e:
            # ocr() raises on failure; exit_code mirrors the CLI exit status
            return {'error': f'OCR failed: {e.exit_code.name}'}, 500

        # Read the result into memory so the temp files can be deleted safely
        with open(out_path, 'rb') as f:
            data = io.BytesIO(f.read())

    return send_file(data, mimetype='application/pdf', as_attachment=True,
                     download_name='ocr_output.pdf')

if __name__ == '__main__':
    app.run(port=5000)

Streamlit web UI

OCRmyPDF provides an optional Streamlit-based web UI:

bash

# Quote the extras so shells such as zsh do not expand the brackets
pip install 'ocrmypdf[webservice]'
# See OCRmyPDF docs for launching the web service

Plugin Ecosystem

OCRmyPDF's plugin interface allows replacing the OCR engine. Available plugins:

OCRmyPDF-EasyOCR

Replaces Tesseract with EasyOCR (PyTorch-based). GPU strongly recommended.

bash

pip install ocrmypdf-easyocr

# Usage
ocrmypdf --plugin ocrmypdf_easyocr -l eng input.pdf output.pdf

OCRmyPDF-PaddleOCR

Replaces Tesseract with PaddleOCR. Powerful GPU-accelerated engine.

bash

pip install ocrmypdf-paddleocr

# Usage
ocrmypdf --plugin ocrmypdf_paddleocr input.pdf output.pdf

OCRmyPDF-AppleOCR

Replaces Tesseract with Apple Vision Framework. macOS only.

bash

pip install ocrmypdf-appleocr

# Usage
ocrmypdf --plugin ocrmypdf_appleocr input.pdf output.pdf

paperless-ngx Integration

paperless-ngx uses OCRmyPDF internally for searchable document management. See paperless-ngx docs for configuration.

Custom Plugins

Create a custom OCR plugin by implementing the OCRmyPDF plugin interface:

python

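# my_ocr_plugin.py
from ocrmypdf import hookimpl
from ocrmypdf.pluginspec import OcrEngine, OrientationConfidence

class MyOcrEngine(OcrEngine):
    """Skeleton engine; method names follow ocrmypdf.pluginspec.OcrEngine."""

    @staticmethod
    def version():
        return "1.0.0"

    @staticmethod
    def creator_tag(options):
        return "MyOCR"

    def __str__(self):
        return "MyOCR 1.0.0"

    @staticmethod
    def languages(options):
        return {"eng"}  # language codes this engine supports

    @staticmethod
    def get_orientation(input_file, options):
        return OrientationConfidence(angle=0, confidence=0.0)  # never rotate

    @staticmethod
    def get_deskew(input_file, options):
        return 0.0  # never deskew

    @staticmethod
    def generate_hocr(input_file, output_hocr, output_text, options):
        raise NotImplementedError("hOCR output not implemented")

    @staticmethod
    def generate_pdf(input_file, output_pdf, output_text, options):
        # Implement OCR logic here: write a text-only PDF and the sidecar text
        raise NotImplementedError("PDF output not implemented")

@hookimpl
def get_ocr_engine():
    return MyOcrEngine()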

bash

# Use custom plugin
ocrmypdf --plugin my_ocr_plugin input.pdf output.pdf
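
The `--plugin` option has a Python API counterpart, the `plugins` keyword of `ocrmypdf.ocr()`. The sketch below loads a plugin by module name and fails early with a clear message if the module cannot be found. The wrapper `ocr_with_plugin` and its `ocr_func` parameter are hypothetical helpers for illustration. The importability check assumes the plugin is given as a module name rather than a file path.

```python
import importlib.util
from typing import Callable, Optional


def ocr_with_plugin(
    input_file: str,
    output_file: str,
    plugin: str,
    ocr_func: Optional[Callable[..., object]] = None,
    **options,
):
    """Run OCR with a plugin identified by its importable module name."""
    # Fail early with a clear message if the plugin module is not importable
    if importlib.util.find_spec(plugin) is None:
        raise ModuleNotFoundError(
            f"Plugin module {plugin!r} not found; is it installed or on sys.path?"
        )
    if ocr_func is None:
        import ocrmypdf  # lazy import; requires OCRmyPDF to be installed
        ocr_func = ocrmypdf.ocr
    # `plugins` is the API counterpart of the CLI's --plugin option
    return ocr_func(input_file, output_file, plugins=[plugin], **options)
```

For the custom plugin above, this would be called as `ocr_with_plugin('input.pdf', 'output.pdf', 'my_ocr_plugin')`, provided `my_ocr_plugin.py` is on the import path.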

Quick Reference

| Task | Code / Command |
| --- | --- |
| Python API basic | `ocrmypdf.ocr('in.pdf', 'out.pdf')` |
| With options | `ocrmypdf.ocr('in.pdf', 'out.pdf', language='eng', deskew=True)` |
| Check result | `if result == ocrmypdf.ExitCode.ok: ...` |
| EasyOCR plugin | `ocrmypdf --plugin ocrmypdf_easyocr in.pdf out.pdf` |
| PaddleOCR plugin | `ocrmypdf --plugin ocrmypdf_paddleocr in.pdf out.pdf` |
| AppleOCR plugin | `ocrmypdf --plugin ocrmypdf_appleocr in.pdf out.pdf` |

Troubleshooting

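Import error: Install OCRmyPDF into the active Python environment with `pip install ocrmypdf`.
Plugin not found: Check that the plugin package is installed in the same environment (for example, `pip install ocrmypdf-easyocr`).
GPU not used (EasyOCR/PaddleOCR): Confirm that the GPU drivers and CUDA are installed and that the underlying framework (PyTorch or PaddlePaddle) can see the GPU.
Memory issues: Set `jobs=1` for large files and process files in smaller batches.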
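
For memory-heavy jobs inside a long-running application, one option is to run each OCR job in a child process, so that its memory is returned to the system when the job ends. The following is a sketch under that assumption, using the `ocrmypdf` command-line module with the `--language`, `--jobs`, and `--skip-text` options. The helper names `build_ocrmypdf_command` and `run_ocr_subprocess` are hypothetical.

```python
import subprocess
import sys
from typing import List


def build_ocrmypdf_command(
    input_file: str, output_file: str, language: str = "eng", jobs: int = 1
) -> List[str]:
    """Build an argv list that runs OCRmyPDF's CLI with this interpreter."""
    return [
        sys.executable, "-m", "ocrmypdf",
        "--language", language,
        "--jobs", str(jobs),
        "--skip-text",
        input_file, output_file,
    ]


def run_ocr_subprocess(input_file: str, output_file: str, **kwargs) -> int:
    """Run OCR in a child process and return its exit status."""
    cmd = build_ocrmypdf_command(input_file, output_file, **kwargs)
    # check=False: the caller inspects the exit status instead of catching
    completed = subprocess.run(cmd, check=False)
    return completed.returncode
```

An exit status of 0 indicates success; other values correspond to the members of `ocrmypdf.ExitCode`.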

Maintainer

partme-ai Core maintainer

Source details

Full Name: partme-ai/full-stack-skills
Branch: main
Path in repo: skills/ocrmypdf-skills/ocrmypdf-api
License: Other
Topics: claude-code agent-skills cursor skills codebuddy qoder
