Agent skill

knowledge-ingest

Ingest URLs, documents, and text into the memory system as structured knowledge

Install this agent skill in your project

npx add-skill https://github.com/QuixiAI/Hexis/tree/main/skills/installed/knowledge-ingest

SKILL.md

Knowledge Base Ingestion

Transform external content -- web pages, documents, raw text -- into structured semantic memories that persist in the knowledge graph.

When to Use

When the user shares a URL and says "learn this" or "remember this article"
When a research workflow finds valuable sources that should be retained long-term
When the user pastes raw text (notes, transcripts, outlines) to be ingested
During heartbeats when a goal involves building knowledge on a specific topic
When importing reference material for a project or domain

Step-by-Step Methodology

Assess the source: Before ingesting, determine what kind of content it is (article, documentation, transcript, raw notes). This guides how aggressively to summarize.
Fetch and parse: For URLs, use ingest_url, which handles fetching, HTML-to-text conversion, and chunking. For raw text, use ingest_text directly.
Check for duplicates: Before ingesting, use recall with the URL or a key phrase from the content to see whether it has already been ingested. Avoid storing the same source twice.
Chunk intelligently: Long content is automatically chunked by the ingestion pipeline. Each chunk becomes a separate semantic memory linked by source metadata. Trust the pipeline's chunking; do not split content manually unless the pipeline is clearly failing.
Add context: When storing via remember, include metadata about the source: URL, author, date published, and why it was ingested (which goal or topic it serves).
Verify ingestion: After ingestion completes, run a quick recall on a key concept from the content to confirm it is retrievable.
Connect to goals: If the ingested content relates to an active goal, note the connection so future heartbeats can leverage it.
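
The steps above can be sketched as a single routine. This is a minimal illustration, not the skill's actual implementation: the tool names recall, ingest_url, ingest_text, and remember come from the text, but their signatures here are assumptions, and ingest_source, IngestResult, and the in-memory stand-ins are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class IngestResult:
    source: str
    skipped: bool   # True when a duplicate was found and nothing was stored
    verified: bool  # True when the follow-up recall found the new content


def ingest_source(
    source: str,
    key_phrase: str,
    *,
    recall: Callable[[str], List[str]],
    ingest_url: Callable[[str], None],
    ingest_text: Callable[[str], None],
    remember: Callable[[str, Dict[str, str]], None],
    goal: Optional[str] = None,
) -> IngestResult:
    """Hypothetical orchestration; the tool signatures are assumed, not documented."""
    is_url = source.startswith(("http://", "https://"))

    # Check for duplicates: query by URL for URLs, by key phrase for raw text.
    if recall(source if is_url else key_phrase):
        return IngestResult(source, skipped=True, verified=False)

    # Fetch and parse: the pipeline chunks the content, so nothing is split here.
    if is_url:
        ingest_url(source)
    else:
        ingest_text(source)

    # Add context: record provenance and, if known, the goal this serves.
    metadata = {"source": source if is_url else "pasted text"}
    if goal:
        metadata["goal"] = goal
    remember("Ingested " + metadata["source"], metadata)

    # Verify ingestion: a quick recall on a key concept from the content.
    return IngestResult(source, skipped=False, verified=bool(recall(key_phrase)))


# In-memory stand-ins so the sketch runs without the real memory system.
store: List[str] = []
notes: List[Tuple[str, Dict[str, str]]] = []


def fake_recall(query: str) -> List[str]:
    return [m for m in store if query.lower() in m.lower()]


def fake_ingest_url(url: str) -> None:
    store.append(url + " :: chunk about vector indexes")


def fake_ingest_text(text: str) -> None:
    store.append(text)


def fake_remember(text: str, metadata: Dict[str, str]) -> None:
    notes.append((text, metadata))


tools = dict(recall=fake_recall, ingest_url=fake_ingest_url,
             ingest_text=fake_ingest_text, remember=fake_remember)

url = "https://example.com/article"
first = ingest_source(url, "vector indexes", goal="learn retrieval", **tools)
second = ingest_source(url, "vector indexes", **tools)
print(first)
print(second)
```

In this sketch the second call is skipped because the duplicate check finds the URL already in the store, which is why the check runs before any fetch.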

Quality Guidelines

Prefer ingesting authoritative, primary sources over summaries or aggregators.
Do not ingest entire websites. Be selective -- ingest the specific pages that contain the needed information.
When ingesting long documents, let the chunking pipeline do its job. Each chunk retains a reference to the parent source.
Always record the source URL or origin. Memories without provenance are harder to evaluate and update later.
Respect rate limits and robots.txt when fetching URLs. If a fetch fails, note the failure and move on rather than retrying aggressively.
For sensitive or private content (internal docs, personal notes), ensure the user understands that ingested content persists in the local database.
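
As an illustration of the fetching guideline, the sketch below checks a URL against robots.txt rules with Python's standard urllib.robotparser and attempts a fetch only once, recording the failure instead of retrying. The helper names allowed_by_robots and fetch_once, the user agent string, and the example rules are assumptions made for this example; how ingest_url handles these cases internally is not stated in the source.

```python
from typing import Callable, List, Optional, Tuple
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def fetch_once(
    url: str,
    fetch: Callable[[str], str],
    failures: List[Tuple[str, str]],
) -> Optional[str]:
    """Try a single fetch; on failure, note it and move on without retrying."""
    try:
        return fetch(url)
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        failures.append((url, str(exc)))
        return None


# Example rules, supplied inline so the sketch runs without network access.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

AGENT = "knowledge-ingest-bot"  # hypothetical user agent name
public_ok = allowed_by_robots(ROBOTS, AGENT, "https://example.com/docs/intro")
private_ok = allowed_by_robots(ROBOTS, AGENT, "https://example.com/private/notes")


def failing_fetch(url: str) -> str:
    # Stand-in for a real HTTP client that hits a timeout.
    raise OSError("connection timed out")


failures: List[Tuple[str, str]] = []
body = fetch_once("https://example.com/docs/intro", failing_fetch, failures)
print(public_ok, private_ok, body, failures)
```

Passing the fetch function in as a parameter keeps the single-attempt policy separate from whichever HTTP client is actually used.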

Maintainer

QuixiAI Core maintainer

Source details

Full Name: QuixiAI/Hexis
Branch: main
Path in repo: skills/installed/knowledge-ingest
License: MIT License
