Agent skill
working-with-documents
Creates and edits Office documents: Word (.docx), PDF, and PowerPoint (.pptx). Use when working with document creation, PDF manipulation, presentation generation, tracked changes, or converting between formats.
Install this agent skill to your Project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/asmayaseen/working-with-documents
SKILL.md
Working with Documents
Quick Reference
| Format | Read | Create | Edit |
|---|---|---|---|
| DOCX | pandoc, python-docx | docx-js | OOXML (unpack/edit/pack) |
| PDF | pdfplumber, pypdf | reportlab | pypdf (merge/split) |
| PPTX | markitdown | html2pptx | OOXML (unpack/edit/pack) |
Word Documents (.docx)
Reading Content
# Convert to markdown (preserves structure)
pandoc document.docx -o output.md
# With tracked changes visible
pandoc --track-changes=all document.docx -o output.md
Creating New Documents
Use docx-js (JavaScript):
const fs = require('fs');
const { Document, Packer, Paragraph, TextRun } = require('docx');

// Build a document with a single bold paragraph
const doc = new Document({
  sections: [{
    children: [
      new Paragraph({
        children: [
          new TextRun({ text: "Hello World", bold: true }),
        ],
      }),
    ],
  }],
});

// Serialize to a buffer and write it to disk
Packer.toBuffer(doc).then(buffer => {
  fs.writeFileSync("output.docx", buffer);
});
Editing Existing Documents (Tracked Changes)
# 1. Unpack
python ooxml/scripts/unpack.py document.docx unpacked/
# 2. Edit the XML under unpacked/ (usually word/document.xml)
# Key files:
# - word/document.xml (main content)
# - word/comments.xml (comments)
# - word/media/ (images)
# 3. Pack
python ooxml/scripts/pack.py unpacked/ edited.docx
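The unpack and pack steps work because a .docx (or .pptx) file is a ZIP archive of XML parts. The sketch below is a minimal standard-library stand-in for that round trip, not the implementation of the skill's own `unpack.py` and `pack.py` scripts; the function names `unpack` and `pack` are chosen here for illustration only.

```python
import zipfile
from pathlib import Path


def unpack(package_path, out_dir):
    """Extract every part of an OOXML package (.docx/.pptx) into out_dir."""
    with zipfile.ZipFile(package_path) as zf:
        zf.extractall(out_dir)


def pack(src_dir, package_path):
    """Zip the files under src_dir back into an OOXML package."""
    src = Path(src_dir)
    with zipfile.ZipFile(package_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Part names inside the package always use forward slashes.
                zf.write(path, path.relative_to(src).as_posix())
```

The skill's scripts may do more than this (for example pretty-printing or validating the XML), so prefer them when they are available.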
Tracked changes XML pattern:
<!-- Deletion -->
<w:del><w:r><w:delText>old text</w:delText></w:r></w:del>
<!-- Insertion -->
<w:ins><w:r><w:t>new text</w:t></w:r></w:ins>
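The pattern above is abbreviated. In real documents, `w:ins` and `w:del` elements normally also carry `w:id`, `w:author`, and `w:date` attributes, and text with leading or trailing spaces needs `xml:space="preserve"`. The following is a minimal sketch of generating such markup; `tracked_replace` is a hypothetical helper, and the way IDs are allocated here (consecutive integers) is an assumption that you would need to reconcile with the IDs already used in the document.

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape, quoteattr


def tracked_replace(old_text, new_text, change_id, author, when=None):
    """Return WordprocessingML that deletes old_text and inserts new_text."""
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")  # assumes `when` is in UTC
    attrs = "w:author={} w:date={}".format(quoteattr(author), quoteattr(stamp))
    deletion = (
        '<w:del w:id="{}" {}><w:r><w:delText xml:space="preserve">{}'
        "</w:delText></w:r></w:del>"
    ).format(change_id, attrs, escape(old_text))
    insertion = (
        '<w:ins w:id="{}" {}><w:r><w:t xml:space="preserve">{}'
        "</w:t></w:r></w:ins>"
    ).format(change_id + 1, attrs, escape(new_text))
    return deletion + insertion
```

The returned fragment is meant to replace the original `<w:r>` run inside a paragraph in `word/document.xml`; run properties (`<w:rPr>`) are omitted here for brevity.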
PDF Documents
Reading PDFs
import pdfplumber

# Extract text
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        print(page.extract_text())

# Extract tables
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        tables = page.extract_tables()
        for table in tables:
            for row in table:
                print(row)
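Each table returned by `extract_tables()` is a list of rows, and each row is a list of cell strings, with `None` for empty cells. As a minimal sketch of doing something more useful than printing, the hypothetical helper `table_to_csv` below turns one such table into CSV text using only the standard library.

```python
import csv
import io


def table_to_csv(table):
    """Serialize one table (rows of cell strings or None) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        # Empty cells come back as None; write them as empty strings.
        writer.writerow(["" if cell is None else cell.strip() for cell in row])
    return buffer.getvalue()


# Usage with pdfplumber (assumes document.pdf exists):
#   with pdfplumber.open("document.pdf") as pdf:
#       for table in pdf.pages[0].extract_tables():
#           print(table_to_csv(table))
```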
Creating PDFs
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.lib.styles import getSampleStyleSheet
doc = SimpleDocTemplate("output.pdf", pagesize=letter)
styles = getSampleStyleSheet()
story = [
Paragraph("Report Title", styles['Title']),
Paragraph("Body text goes here.", styles['Normal']),
]
doc.build(story)
Merging/Splitting PDFs
from pypdf import PdfReader, PdfWriter

# Merge: append every page of each input to one writer
writer = PdfWriter()
for pdf_file in ["doc1.pdf", "doc2.pdf"]:
    reader = PdfReader(pdf_file)
    for page in reader.pages:
        writer.add_page(page)
with open("merged.pdf", "wb") as f:
    writer.write(f)

# Split: write each page to its own file
reader = PdfReader("input.pdf")
for i, page in enumerate(reader.pages):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f"page_{i+1}.pdf", "wb") as f:
        writer.write(f)
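To extract a subset of pages rather than splitting every page, it can help to accept a 1-based range string and convert it to the zero-based indices that `reader.pages` uses. The helper below is a hypothetical sketch of that conversion; the `"1-3,5"` syntax is an assumption modeled loosely on common page-range notation, not a pypdf feature.

```python
def parse_page_ranges(spec, page_count):
    """Turn a 1-based spec such as "1-3,5" into zero-based page indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start)
        last = int(end) if end else first
        if not 1 <= first <= last <= page_count:
            raise ValueError(f"page range {part!r} is outside 1-{page_count}")
        indices.extend(range(first - 1, last))
    return indices


# Usage with pypdf (assumes `reader` and `writer` as in the split example):
#   for i in parse_page_ranges("1-3,5", len(reader.pages)):
#       writer.add_page(reader.pages[i])
```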
Command-Line Tools
# Extract text
pdftotext input.pdf output.txt
pdftotext -layout input.pdf output.txt # Preserve layout
# Merge with qpdf
qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf
# Split pages
qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
PowerPoint Presentations (.pptx)
Reading Content
# Convert to markdown
python -m markitdown presentation.pptx
Creating New Presentations
Use the html2pptx workflow:
- Create HTML slides (720pt × 405pt for 16:9)
- Convert with html2pptx.js library
- Validate with thumbnail grid
# Create thumbnails for validation
python scripts/thumbnail.py output.pptx --cols 4
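The 720pt × 405pt slide size in the workflow above corresponds to a 10 in × 5.625 in slide. The short sketch below checks that arithmetic and also converts to EMU, the unit PresentationML uses for lengths (914400 EMU per inch, so 12700 EMU per point). The `slide_size` helper is hypothetical and is not part of html2pptx.

```python
from fractions import Fraction

POINTS_PER_INCH = 72
EMU_PER_POINT = 12700  # 914400 EMU per inch / 72 points per inch


def slide_size(width_pt, height_pt):
    """Express a slide size given in points as inches, EMU, and aspect ratio."""
    return {
        "inches": (width_pt / POINTS_PER_INCH, height_pt / POINTS_PER_INCH),
        "emu": (width_pt * EMU_PER_POINT, height_pt * EMU_PER_POINT),
        "aspect": Fraction(width_pt, height_pt),
    }


size = slide_size(720, 405)
print(size["inches"])  # (10.0, 5.625)
print(size["emu"])     # (9144000, 5143500)
print(size["aspect"])  # 16/9
```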
Editing Existing Presentations
# 1. Unpack
python ooxml/scripts/unpack.py presentation.pptx unpacked/
# Key files:
# - ppt/slides/slide1.xml, slide2.xml, etc.
# - ppt/notesSlides/ (speaker notes)
# - ppt/media/ (images)
# 2. Edit XML
# 3. Validate
python ooxml/scripts/validate.py unpacked/ --original presentation.pptx
# 4. Pack
python ooxml/scripts/pack.py unpacked/ edited.pptx
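Before editing, it is often useful to see which slide parts a presentation contains. The sketch below lists them straight from the .pptx archive with the standard library; `list_slide_parts` is a hypothetical helper. It sorts by the number in the file name, which is an assumption of convenience: the order in which slides are actually shown is defined in `ppt/presentation.xml`, not by file names.

```python
import re
import zipfile

SLIDE_PART = re.compile(r"ppt/slides/slide(\d+)\.xml")


def list_slide_parts(pptx_path):
    """Return slide part names, sorted numerically by file name."""
    with zipfile.ZipFile(pptx_path) as zf:
        matches = [SLIDE_PART.fullmatch(name) for name in zf.namelist()]
    found = [(int(m.group(1)), m.group(0)) for m in matches if m]
    return [name for _, name in sorted(found)]
```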
Rearranging Slides
# Duplicate, reorder, delete slides
python scripts/rearrange.py template.pptx output.pptx 0,3,3,5,7
# Creates: slide 0, slide 3 (twice), slide 5, slide 7
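The index list is zero-based, and repeating an index duplicates that slide. The sketch below illustrates only those semantics on a plain Python list; it is not how `scripts/rearrange.py` is implemented, and the `rearrange` function is a hypothetical stand-in.

```python
def rearrange(slides, order_spec):
    """Select slides by zero-based index; repeated indices duplicate a slide."""
    order = [int(token) for token in order_spec.split(",")]
    for index in order:
        if not 0 <= index < len(slides):
            raise IndexError(f"slide index {index} is out of range")
    return [slides[index] for index in order]


template = [f"slide {i}" for i in range(8)]
print(rearrange(template, "0,3,3,5,7"))
# ['slide 0', 'slide 3', 'slide 3', 'slide 5', 'slide 7']
```

Slides whose indices are left out of the list (1, 2, 4, and 6 here) are dropped from the output.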
Converting Between Formats
# DOCX/PPTX to PDF
soffice --headless --convert-to pdf document.docx
# PDF to images
pdftoppm -jpeg -r 150 document.pdf page
# Creates: page-1.jpg, page-2.jpg, etc.
# DOCX to Markdown
pandoc document.docx -o output.md
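When these conversions run from a script, it is convenient to wrap the command line in a function. The sketch below does so for the `soffice` PDF conversion, assuming `soffice` is on the PATH; `pdf_command` and `convert_to_pdf` are hypothetical names, and `--outdir` is added so that the output location is explicit.

```python
import subprocess
from pathlib import Path


def pdf_command(source, outdir="."):
    """Build the soffice command line for a headless PDF conversion."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_to_pdf(source, outdir="."):
    """Run the conversion and return the expected output path."""
    subprocess.run(pdf_command(source, outdir), check=True)
    return Path(outdir) / (Path(source).stem + ".pdf")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.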
OCR for Scanned Documents
import pytesseract
from pdf2image import convert_from_path

# Render each page to an image, then OCR it
images = convert_from_path('scanned.pdf')
text = ""
for image in images:
    text += pytesseract.image_to_string(image)
Design Guidelines (Presentations)
Color Palettes
Pick 3-5 colors that work together:
| Palette | Colors |
|---|---|
| Classic Blue | Navy #1C2833, Slate #2E4053, Silver #AAB7B8 |
| Teal & Coral | Teal #5EA8A7, Coral #FE4447, White #FFFFFF |
| Black & Gold | Gold #BF9A4A, Black #000000, Cream #F4F6F6 |
Web-Safe Fonts Only
Arial, Helvetica, Times New Roman, Georgia, Verdana, Tahoma, Trebuchet MS, Courier New, Impact
Layout Rules
- Two-column: Use for exactly 2 distinct items
- Three-column: Use for exactly 3 items
- Never vertically stack charts below text
- Full-bleed images with text overlays work well
Dependencies
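# Python
pip install pypdf pdfplumber reportlab python-docx openpyxl
# Python (for the PPTX reading and OCR examples)
pip install "markitdown[pptx]" pytesseract pdf2image
# System tools
apt-get install pandoc poppler-utils libreoffice qpdf tesseract-ocr
# Node.js (for docx-js)
npm install docx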
Verification
Run: python scripts/verify.py
Related Skills
- working-with-spreadsheets: Excel file handling
- building-nextjs-apps: Frontend for document uploads