Agent skill
pypdf
Manipulate PDF documents programmatically. Merge, split, rotate, and watermark PDFs. Extract text and metadata. Handle form filling and encryption/decryption.
Install this agent skill to your Project
npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/data/office/pypdf
Overview
pypdf is a pure-Python library for reading and manipulating PDF files. This skill covers patterns for the following tasks:
- PDF merging - Combine multiple PDFs into one document
- PDF splitting - Extract specific pages or split into multiple files
- Page rotation - Rotate pages by 90, 180, or 270 degrees
- Watermarking - Add text or image watermarks to pages
- Text extraction - Extract text content from PDF pages
- Metadata handling - Read and modify PDF metadata
- Form filling - Fill PDF form fields programmatically
- Encryption/Decryption - Secure PDFs with passwords
When to Use This Skill
USE when:
- Merging multiple PDF files into a single document
- Splitting large PDFs into smaller files
- Extracting specific pages from PDFs
- Adding watermarks or stamps to documents
- Extracting text content for analysis
- Reading or modifying PDF metadata
- Filling PDF forms programmatically
- Encrypting or decrypting PDF files
- Adding page numbers or headers/footers
- Rotating or reordering pages
- Automating PDF workflows in pipelines
DON'T USE when:
- Creating PDFs from scratch (use reportlab or weasyprint)
- Needing advanced text layout control (use reportlab)
- Converting other formats to PDF (use dedicated converters)
- Running OCR on scanned documents (use pytesseract + pdf2image)
- Building complex forms (use reportlab)
- Editing existing text content (pypdf has only limited support)
Prerequisites
Installation
# Basic installation
pip install pypdf
# Using uv (recommended)
uv pip install pypdf
# With crypto support (needed for AES encryption and decryption)
pip install "pypdf[crypto]"
*See sub-skills for full details.*
### Verify Installation
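```python
import pypdf  # needed so that pypdf.__version__ resolves
# PdfMerger was removed in pypdf 5.0.0; PdfWriter now handles merging.
from pypdf import PdfReader, PdfWriter
from pypdf.errors import PdfReadError

print("pypdf installed successfully!")
print(f"Version: {pypdf.__version__}")
```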
Version History
1.0.0 (2026-01-17)
- Initial skill creation
- Core capabilities documentation
- 6 complete code examples
- Batch processing patterns
- Encryption and form handling
Resources
- Official Documentation: https://pypdf.readthedocs.io/
- GitHub Repository: https://github.com/py-pdf/pypdf
- PyPI Package: https://pypi.org/project/pypdf/
- Migration from PyPDF2: https://pypdf.readthedocs.io/en/latest/migration.html
Related Skills
- reportlab - PDF creation from scratch
- python-docx - Word document handling
- pillow - Image processing for PDF images
- pdf2image - Convert PDF pages to images
This skill provides comprehensive patterns for PDF manipulation refined from production document processing systems.
Sub-Skills
- 1. PDF Merging
- 2. PDF Splitting
- 3. Page Rotation and Transformation
- 4. Watermarking and Stamping
- 5. Text Extraction and Metadata
- 6. Encryption and Form Filling
- Batch PDF Processing Pipeline
- 1. Memory Management (+2)
- Common Issues