Agent skill

extract-text-pdf

Extract text from PDF files using PyMuPDF. Use this skill when you need to read the contents of a PDF file, such as a resume, report, or manual, into plain text for analysis or processing.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/extract-text-pdf

SKILL.md

Extract Text from PDF

Overview

This skill provides a reliable way to extract text from PDF files using the pymupdf library (historically imported as fitz). It handles document structure and text encoding better than many basic tools.

Prerequisites

This skill requires the pymupdf Python library.

bash

pip install pymupdf

Usage

Extract Text Script

The skill includes a Python script scripts/extract_pdf_text.py that extracts text from a PDF file.

Syntax:

bash

python3 .agent/skills/extract-text-pdf/scripts/extract_pdf_text.py <path_to_pdf> [--layout]

Arguments:

path_to_pdf: The absolute path to the PDF file you want to read.
--layout: (Optional) Preserve the page layout more precisely. Without this flag, the script extracts text in natural reading order.
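
The source of scripts/extract_pdf_text.py is not reproduced here, so the following is only a minimal sketch of how a script with this command-line interface could be written with PyMuPDF. The function names (build_parser, extract_text, main) are hypothetical, and mapping --layout to the sort=True option of page.get_text() is an assumption made for illustration; the real script may implement the flag differently.

```python
import argparse
import sys


def build_parser():
    """Command-line interface matching the syntax shown above."""
    parser = argparse.ArgumentParser(description="Extract text from a PDF file.")
    parser.add_argument("path_to_pdf", help="absolute path to the PDF file")
    parser.add_argument(
        "--layout",
        action="store_true",
        help="try to keep the on-page arrangement of the text",
    )
    return parser


def extract_text(path, layout=False):
    """Return the text of every page, joined with newlines."""
    # Imported lazily so that --help still works when PyMuPDF is missing.
    # Older PyMuPDF releases expose the same API under the name `fitz`.
    import pymupdf

    pages = []
    with pymupdf.open(path) as doc:
        for page in doc:
            # Assumption: --layout is modelled here as sort=True, which orders
            # text top-to-bottom, left-to-right by position on the page.
            pages.append(page.get_text("text", sort=layout))
    return "\n".join(pages)


def main(argv=None):
    args = build_parser().parse_args(argv)
    sys.stdout.write(extract_text(args.path_to_pdf, layout=args.layout))
    return 0

# When saved as a script, finish with: raise SystemExit(main())
```

Writing to standard output keeps the sketch compatible with shell redirection, which is how the examples in this section capture the result.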

Example:

bash

# Extract text from a resume
python3 .agent/skills/extract-text-pdf/scripts/extract_pdf_text.py /Users/user/documents/resume.pdf

# Capture output to a file
python3 .agent/skills/extract-text-pdf/scripts/extract_pdf_text.py /path/to/doc.pdf > extracted_text.txt
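
If the extracted text is needed inside a Python program rather than in a file, the same command can be run through subprocess. This is a sketch under a few assumptions: it runs from the project root so the relative script path resolves, it uses the current interpreter (sys.executable) in place of python3, and the helper names build_command and run_extraction are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script location as given in the syntax above, relative to the project root.
SCRIPT = Path(".agent/skills/extract-text-pdf/scripts/extract_pdf_text.py")


def build_command(pdf_path, layout=False):
    """Assemble the argument list for the extraction script."""
    cmd = [sys.executable, str(SCRIPT), str(pdf_path)]
    if layout:
        cmd.append("--layout")
    return cmd


def run_extraction(pdf_path, layout=False):
    """Run the script and return what it printed to standard output."""
    result = subprocess.run(
        build_command(pdf_path, layout),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the PDF path contains spaces.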

When to Use

Use this skill when:

You need to read the content of a PDF file.
You want to analyze text data from a PDF (e.g., parsing a resume).
Simple tools (cat, grep) won't work because PDF is a binary format.
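
One quick way to decide whether a file needs this skill is to check for the PDF header, since PDF files begin with the bytes %PDF-. The helper below is a hypothetical illustration, not part of the skill; it only inspects the first five bytes, so a file with leading junk before the header would be reported as not a PDF.

```python
def looks_like_pdf(path):
    """Return True if the file starts with the PDF header bytes."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"%PDF-"
```

A file that passes this check should go through the extraction script; plain-text files can still be read directly.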

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/extract-text-pdf
License: MIT License
