Agent skill

oma-pdf

Convert PDF files to Markdown using opendataloader-pdf. Extracts text, tables, headings, lists, and images with correct reading order. Use for PDF parsing, PDF to Markdown conversion, document extraction, and AI-ready data preparation.

Install this agent skill in your project

npx add-skill https://github.com/first-fluke/oh-my-agent/tree/main/.agents/skills/oma-pdf

SKILL.md

PDF Skill - PDF to Markdown Conversion

When to use

Converting PDF documents to Markdown for LLM context or RAG
Extracting structured content (tables, headings, lists) from PDFs
Preparing PDF data for AI consumption
User says "convert this PDF", "parse PDF", "PDF to markdown", "read this PDF"

When NOT to use

Generating or creating PDFs -> use appropriate document tools
Editing existing PDFs -> out of scope
Simple file reading of already-text files -> use Read tool directly

Core Rules

Run the converter with `uvx opendataloader-pdf`; no separate installation is required
Default output format is Markdown
If no output directory is specified, write output to the same directory as the input PDF
Preserve document structure: headings, tables, lists, images
For scanned PDFs, use hybrid mode with OCR
Always run uvx mdformat on the output to normalize Markdown formatting
Validate that the output Markdown is readable and well-structured
Report any conversion issues (missing tables, garbled text) to the user
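
The rules above can be sketched as a single shell function. This is a minimal sketch, not the skill's actual protocol: the names `convert_pdf` and `PDF_RUNNER` are hypothetical, and it assumes the converter writes `<input-basename>.md` into the output directory, which should be checked against the real output.

```bash
# Runner is overridable so the function can be dry-run (hypothetical variable).
PDF_RUNNER="${PDF_RUNNER:-uvx}"

# convert_pdf INPUT.pdf [OUTPUT_DIR]
# Converts one PDF, checks that Markdown was produced, then normalizes it.
convert_pdf() (
  input="$1"
  # Default to the input file's own directory, as the rules require.
  out_dir="${2:-$(dirname "$input")}"
  # ASSUMPTION: output is named <input-basename>.md inside the output directory.
  md_file="$out_dir/$(basename "$input" .pdf).md"

  $PDF_RUNNER opendataloader-pdf "$input" --output-dir "$out_dir" || exit 1

  # Report a conversion issue instead of silently continuing.
  if [ ! -s "$md_file" ]; then
    echo "conversion issue: $md_file is missing or empty" >&2
    exit 1
  fi

  # mdformat rewrites the file in place.
  $PDF_RUNNER mdformat "$md_file" || exit 1
  echo "$md_file"
)
```

Calling `convert_pdf report.pdf` prints the path of the normalized Markdown file on success and returns a non-zero status otherwise, so a caller can relay the failure to the user.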

How to Execute

Follow resources/execution-protocol.md step by step.

Quick Reference

Basic conversion (single file)

```bash
uvx opendataloader-pdf input.pdf
```

Specify output directory

```bash
uvx opendataloader-pdf input.pdf --output-dir ./output/
```

Multiple files or folder

```bash
uvx opendataloader-pdf file1.pdf file2.pdf folder/
```
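
When a batch needs per-file error reporting, one option is to invoke the converter once per PDF rather than passing everything in a single call. The sketch below assumes that trade-off is acceptable; `convert_each` and `PDF_RUNNER` are hypothetical names, and only the `--output-dir` flag shown above is used.

```bash
# Runner is overridable so the loop can be dry-run (hypothetical variable).
PDF_RUNNER="${PDF_RUNNER:-uvx}"

# convert_each DIR [OUTPUT_DIR]
# Converts every *.pdf directly inside DIR, one invocation per file.
# Writes "FAILED: <path>" to stderr for each failure and returns non-zero
# if at least one file failed.
convert_each() (
  dir="$1"
  out_dir="${2:-$dir}"
  status=0
  for pdf in "$dir"/*.pdf; do
    # Skip the unexpanded pattern when the directory holds no PDFs.
    [ -e "$pdf" ] || continue
    if ! $PDF_RUNNER opendataloader-pdf "$pdf" --output-dir "$out_dir"; then
      echo "FAILED: $pdf" >&2
      status=1
    fi
  done
  exit "$status"
)
```

A failure in one file does not stop the rest of the batch, and the collected `FAILED:` lines give a ready-made list of conversion issues to report.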

With OCR (scanned PDFs)

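Requires a running hybrid-mode server:

```bash
# Start the hybrid server first (long-running, so use a separate terminal)
uvx opendataloader-pdf-hybrid --port 5002 --force-ocr --ocr-lang "ko,en"
# Then convert through the running server
uvx opendataloader-pdf --hybrid docling-fast input.pdf
```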

With image extraction (embedded base64)

```bash
uvx opendataloader-pdf input.pdf --image-output embedded --image-format png
```

With Tagged PDF structure

```bash
uvx opendataloader-pdf input.pdf --use-struct-tree
```

Output Formats

| Format | Flag | Use case |
| --- | --- | --- |
| Markdown | `--format markdown` | Default. Clean text for LLM/RAG |
| JSON | `--format json` | Structured data with bounding boxes |
| HTML | `--format html` | Web display |
| Text | `--format text` | Plain text extraction |
| Combined | `--format markdown,json` | Multiple formats at once |
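
The combined form takes a comma-separated list, so a small wrapper can build the flag value from separate arguments. This is a sketch only: `join_formats`, `convert_as`, and `PDF_RUNNER` are hypothetical names, and the accepted values are limited to the four formats listed in the table, which may not be everything the tool supports.

```bash
# Runner is overridable so the command can be dry-run (hypothetical variable).
PDF_RUNNER="${PDF_RUNNER:-uvx}"

# join_formats FORMAT...
# Validates each format against the table above and joins them with commas.
join_formats() (
  for f in "$@"; do
    case "$f" in
      markdown|json|html|text) ;;
      *) echo "unsupported format: $f" >&2; exit 1 ;;
    esac
  done
  # "$*" joins the arguments with the first character of IFS.
  IFS=,
  printf '%s\n' "$*"
)

# convert_as INPUT.pdf FORMAT...
convert_as() (
  input="$1"
  shift
  formats="$(join_formats "$@")" || exit 1
  $PDF_RUNNER opendataloader-pdf "$input" --format "$formats"
)
```

With `PDF_RUNNER=echo`, `convert_as input.pdf markdown json` prints `opendataloader-pdf input.pdf --format markdown,json`, which is a quick way to inspect the command before running it for real.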

Configuration

Project-specific settings: config/pdf-config.yaml

Troubleshooting

| Issue | Solution |
| --- | --- |
| Garbled text in output | Try `--use-struct-tree` for Tagged PDFs |
| Scanned PDF (no text layer) | Use hybrid mode with `--force-ocr` |
| Tables not extracted properly | Use hybrid mode for complex/borderless tables |
| Non-English PDF | Add `--ocr-lang` with appropriate language codes |
| Large PDF (100+ pages) | Process in page ranges or use batch mode |
| Formula not extracted | Use hybrid mode with `--enrich-formula` |
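
Some of these issues can be spotted with a rough check on the generated Markdown before reading it in full. The heuristics below are assumptions rather than part of the skill: `check_markdown` is a hypothetical helper, it treats lines starting with `|` as table rows, and it treats the Unicode replacement character (U+FFFD) as a sign of garbled text.

```bash
# check_markdown FILE.md
# Prints simple counts that hint at missing structure or garbled text.
check_markdown() (
  file="$1"
  if [ ! -s "$file" ]; then
    echo "empty or missing: $file" >&2
    exit 1
  fi
  # grep -c exits non-zero on a zero count, so "|| true" keeps the sketch
  # usable under "set -e"; the printed count is still captured.
  headings=$(grep -c '^#\{1,6\} ' "$file" || true)
  table_rows=$(grep -c '^|' "$file" || true)
  # HEURISTIC: U+FFFD is the UTF-8 byte sequence EF BF BD.
  garbled=$(LC_ALL=C grep -c "$(printf '\357\277\275')" "$file" || true)
  echo "headings=$headings table_rows=$table_rows garbled_lines=$garbled"
)
```

A source PDF with visible tables but `table_rows=0` in the output, or any non-zero `garbled_lines`, is a reasonable cue to retry with the options in the table and to report the issue to the user.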

References

Execution steps: resources/execution-protocol.md
Configuration: config/pdf-config.yaml
Context loading: ../_shared/core/context-loading.md
Quality principles: ../_shared/core/quality-principles.md

Maintainer

first-fluke Core maintainer

Source details

Full Name: first-fluke/oh-my-agent
Branch: main
Path in repo: .agents/skills/oma-pdf
License: MIT License
Topics: claude-code agent-skills ai-agents cursor agentic-coding codex opencode multi-agent-systems multi-agent orchestration agent-harness orchestrator serena oh-my-agent
