Agent skill

libingest

libingest - Document ingestion pipeline. IngestPipeline orchestrates configurable transformation steps. IngestStep defines individual processors such as pdf-to-images, images-to-html, extract-context, and annotate-html. Converts PDF, PowerPoint, and image files to Schema.org-annotated HTML. Use for document processing, knowledge extraction, and content transformation.

Install this agent skill in your project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/libingest

SKILL.md

libingest Skill

When to Use

Converting PDF documents to structured HTML
Processing PowerPoint presentations for indexing
Extracting semantic content from images via OCR
Building document ingestion pipelines

Key Concepts

IngestPipeline: Orchestrates a sequence of transformation steps defined in config/ingest.yml.

IngestStep: Individual processing step (pdf-to-images, images-to-html, extract-context, annotate-html, normalize-html).
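
To make the relationship between the two concepts concrete, here is a minimal sketch of running steps in sequence, with each step's output feeding the next. It is not the libingest implementation: the `runSteps` function, the `{ name, run }` step shape, and the `trace` field are hypothetical, and the step order simply follows the listing above, which may differ from the order configured in config/ingest.yml.

```javascript
// Hypothetical sketch of sequential step orchestration.
// This is NOT the libingest API; names and shapes are assumptions.
const stepNames = [
  "pdf-to-images",
  "images-to-html",
  "extract-context",
  "annotate-html",
  "normalize-html",
];

// Stub steps: each one only records that it ran, in place of real processing.
const steps = stepNames.map((name) => ({
  name,
  run: async (input) => ({ ...input, trace: [...input.trace, name] }),
}));

// Run the steps in order, passing each step's output to the next step.
async function runSteps(stepList, input) {
  let current = input;
  for (const step of stepList) {
    current = await step.run(current);
  }
  return current;
}
```

Calling `runSteps(steps, { source: "document.pdf", trace: [] })` resolves to an object whose `trace` array lists the step names in the order they ran.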

Usage Patterns

Pattern 1: Run ingestion via CLI

```bash
# Drop files in data/ingest/in/
cp document.pdf data/ingest/in/

# Run the pipeline
make ingest
```

Pattern 2: Programmatic ingestion

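```javascript
import { IngestPipeline } from "@copilot-ld/libingest";

// config, storage, and llmClient are assumed to be created elsewhere
// (for example from config/ingest.yml and libllm); their setup is not shown here.
const pipeline = new IngestPipeline(config, storage, llmClient);

// Process a single document and wait for the pipeline to finish.
const result = await pipeline.process("document.pdf");
// result.output points to the final HTML
```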
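
When several documents need to be ingested, the same call can be wrapped in a small loop. The sketch below relies only on the `process(file)` method and the `result.output` field shown in this pattern; the `ingestAll` helper is hypothetical, and it assumes that `process` rejects when a document fails, which the source does not confirm.

```javascript
// Hypothetical helper, not part of libingest.
// Assumes pipeline.process(file) returns a promise that rejects on failure.
async function ingestAll(pipeline, files) {
  const succeeded = [];
  const failed = [];
  // Process files one at a time so a single failure does not stop the batch.
  for (const file of files) {
    try {
      const result = await pipeline.process(file);
      succeeded.push({ file, output: result.output });
    } catch (error) {
      failed.push({ file, error });
    }
  }
  return { succeeded, failed };
}
```

Processing sequentially keeps the example simple; whether the pipeline tolerates concurrent calls is not stated in the source, so running documents in parallel would need to be checked first.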

Integration

Configured via config/ingest.yml. Uses libllm for vision processing. Output is stored in data/ingest/pipeline/.

Maintainer

majiayu000 (core maintainer)

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/libingest
License: MIT License
