Agent skill

llm-pipeline

Pydantic-AI agents, RAG, embeddings for Pulse Radar knowledge extraction.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/llm-pipeline

SKILL.md

LLM Pipeline Skill

<extraction-flow>
```python
# 2. Scoring (AI Judge, not heuristics - ADR-003)
score = await importance_scorer.score(message)
# classification: SIGNAL (>0.6) / NOISE (<0.3)

# 3. Auto-trigger extraction when threshold met
if unprocessed_count >= 10:  # ai_config.message_threshold
    await extract_knowledge_from_messages_task.kiq()

# 4. KnowledgeOrchestrator runs Pydantic AI agent
agent = Agent(
    model=model,
    system_prompt=get_extraction_prompt("uk"),
    output_type=KnowledgeExtractionOutput,  # CRITICAL: structured output
    output_retries=5,
)
result = await agent.run(messages_content)

# 5. Save to DB + embed
await save_topics_and_atoms(result.output)
await embed_atoms_batch_task.kiq(atom_ids)
```
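The thresholds in steps 2 and 3 can be sketched as plain functions. This is a minimal illustration, not the project's code: `classify_score`, `should_trigger_extraction`, and the `"UNCERTAIN"` label for scores between the two cut-offs are hypothetical names. Only the 0.6 and 0.3 cut-offs and the message threshold of 10 come from the flow above.

```python
SIGNAL_THRESHOLD = 0.6   # scores above this are SIGNAL (from the flow above)
NOISE_THRESHOLD = 0.3    # scores below this are NOISE (from the flow above)
MESSAGE_THRESHOLD = 10   # mirrors ai_config.message_threshold in the flow above


def classify_score(score: float) -> str:
    """Map an importance score to a label.

    The source names only SIGNAL and NOISE; the middle-band label
    "UNCERTAIN" is an assumption made for this sketch.
    """
    if score > SIGNAL_THRESHOLD:
        return "SIGNAL"
    if score < NOISE_THRESHOLD:
        return "NOISE"
    return "UNCERTAIN"


def should_trigger_extraction(unprocessed_count: int,
                              threshold: int = MESSAGE_THRESHOLD) -> bool:
    """Return True once enough unprocessed messages have accumulated."""
    return unprocessed_count >= threshold
```

Keeping the cut-offs as named constants makes it easy to swap in values read from configuration instead of literals.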

</extraction-flow>

<agent-creation>
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider
from pydantic_ai.providers.openai import OpenAIProvider

# Provider-specific model creation
if provider.type == "ollama":
    model = OpenAIChatModel(
        model_name=agent_config.model_name,
        provider=OllamaProvider(base_url=provider.base_url),
    )
elif provider.type == "openai":
    model = OpenAIChatModel(
        model_name=agent_config.model_name,
        provider=OpenAIProvider(api_key=api_key),
    )
else:
    # Fail early instead of leaving `model` undefined
    raise ValueError(f"Unsupported provider type: {provider.type}")

# Agent with structured output
agent = Agent(
    model=model,
    output_type=MyPydanticModel,  # Forces JSON schema
    system_prompt="...",
    output_retries=5,  # Retry when the output fails validation
)
```
</agent-creation>

<embedding-service>
```python
await embedding_service.generate_embedding(text)
await embedding_service.embed_messages_batch(session, ids, batch_size=10)
```
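The `batch_size=10` argument suggests that IDs are processed in fixed-size chunks. A minimal sketch of that chunking follows, assuming a hypothetical `chunked` helper; the internals of `embed_messages_batch` are not shown in this skill, so this only illustrates the idea.

```python
from collections.abc import Iterator, Sequence


def chunked(ids: Sequence[int], batch_size: int = 10) -> Iterator[list[int]]:
    """Yield consecutive slices of `ids`, each at most `batch_size` long.

    Hypothetical helper for illustration; not part of the embedding service.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


# 25 sample IDs split into batches of 10, 10, and 5
batches = list(chunked(list(range(1, 26)), batch_size=10))
```

Smaller batches keep each embedding request bounded in size, at the cost of more round trips.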

</embedding-service>

<rag-context>
```python
# SemanticSearchService uses pgvector cosine similarity
similar_atoms = await search_service.search_atoms(
    query_embedding=embedding,
    limit=5,
    threshold=0.65,  # ai_config.semantic_search
)

# RAGContextBuilder assembles context for LLM
context = await rag_builder.build_context(
    query=user_query,
    similar_atoms=similar_atoms,
    related_messages=messages,
)
```
</rag-context>
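To show what `limit` and `threshold` mean in the search call above, here is an in-memory sketch of cosine-similarity filtering. It stands in for the pgvector query and is not the `SemanticSearchService` implementation: `cosine_similarity`, `search_atoms_in_memory`, the sample vectors, and the choice of an inclusive (`>=`) threshold are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_atoms_in_memory(query_embedding, atoms, limit=5, threshold=0.65):
    """Return up to `limit` (atom_id, similarity) pairs at or above `threshold`,
    most similar first. Hypothetical stand-in for a database query."""
    scored = [
        (atom_id, cosine_similarity(query_embedding, embedding))
        for atom_id, embedding in atoms.items()
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:limit]


# Sample data, invented for this example
atoms = {
    "atom-1": [1.0, 0.0],
    "atom-2": [0.8, 0.6],
    "atom-3": [0.0, 1.0],
}
results = search_atoms_in_memory([1.0, 0.0], atoms)
```

Note that pgvector's `<=>` operator returns cosine distance, so a query written against it would compare `1 - distance` with the similarity threshold.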

| Strategy | Data Type | Pulse Radar Use |
|----------|-----------|-----------------|
| RAG | Dynamic (messages, atoms) | Semantic search, history retrieval |
| CAG | Static (project config) | Keywords, glossary, components preloaded |

Hybrid: project context (CAG) plus similar atoms (RAG) gives the best extraction quality.

See: @references/rag.md for a detailed comparison.
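The hybrid strategy can be pictured as joining a static block with a retrieved one. The sketch below assumes a hypothetical `ProjectContext` dataclass and `build_hybrid_context` helper with invented sample data; the actual `RAGContextBuilder` output format is not described in this skill.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectContext:
    """Static, preloaded project data (the CAG side). Field names are assumptions."""
    keywords: list[str] = field(default_factory=list)
    glossary: dict[str, str] = field(default_factory=dict)


def build_hybrid_context(project: ProjectContext, similar_atoms: list[str]) -> str:
    """Concatenate preloaded project data with retrieved atoms into one text block."""
    lines = ["## Project context (CAG)"]
    lines.append("Keywords: " + ", ".join(project.keywords))
    for term, meaning in project.glossary.items():
        lines.append(f"{term}: {meaning}")
    lines.append("## Similar atoms (RAG)")
    lines.extend(f"- {atom}" for atom in similar_atoms)
    return "\n".join(lines)


# Sample data, invented for this example
project = ProjectContext(
    keywords=["billing", "auth"],
    glossary={"SLA": "service-level agreement"},
)
prompt_context = build_hybrid_context(
    project, ["Login uses OAuth2", "Invoices are sent monthly"]
)
```

Because the project block does not depend on the query, it can be built once and reused, while the atom list is rebuilt per request.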

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/llm-pipeline
License: MIT License
