Agent skill

ranking

Ranks and scores retrieved documents based on similarity metrics from vector search. Use when sorting documents by relevance, prioritizing results, or when the user mentions ranking, scoring, or ordering documents.


Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/ranking

SKILL.md

Document Ranking

Instructions

Rank and score documents based on similarity metrics already computed by the Retrieval Agent. This skill operates on retrieved documents that already carry distance and similarity information; it does NOT query ChromaDB again.

Default workflow:

1. Receive documents from the Retrieval Agent (each includes distance and similarity_score)
2. Call ranking functions to sort documents by relevance
3. Optionally filter by similarity threshold
4. Return the ranked list for downstream processing (grading or generation)

Key functions:

python

# Rank documents by similarity score (descending)
ranked_docs = rank_documents_by_similarity(documents)

# Rank documents by distance (ascending - lower is better)
ranked_docs = rank_documents_by_distance(documents)

# Filter documents by similarity threshold
filtered_docs = filter_by_similarity_threshold(documents, threshold=0.7)

# Get top-k ranked documents
top_docs = get_top_k_documents(ranked_docs, k=10)

Similarity Metrics:

Each document from Retrieval Agent contains:

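distance: Cosine distance from ChromaDB (lower = more similar); this assumes the collection is configured to use cosine distance, and the ranges below do not apply to other distance functions
- Range: [0, 2]
- 0 = identical, 1 = orthogonal (unrelated), 2 = opposite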
similarity_score: Computed as 1 - distance (higher = more similar)
- Range: [-1, 1] for cosine
- 1 = identical, -1 = opposite
- Typical useful range: [0.5, 1.0]
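The relationship between the two fields can be sketched as below. `with_similarity_score` is a hypothetical helper, not part of the skill; it assumes each document is a dict whose `distance` key holds a cosine distance.

```python
def with_similarity_score(doc: dict) -> dict:
    """Return a copy of doc with similarity_score = 1 - distance.

    Assumption: doc["distance"] is a cosine distance in [0, 2].
    """
    distance = doc["distance"]
    if not 0.0 <= distance <= 2.0:
        raise ValueError(f"cosine distance out of range [0, 2]: {distance}")
    # Copy instead of mutating, so the caller's document is left untouched.
    return {**doc, "similarity_score": 1.0 - distance}


docs = [
    {"document": "A", "distance": 0.0},  # identical
    {"document": "B", "distance": 2.0},  # opposite
]
scored = [with_similarity_score(d) for d in docs]
```

The range check is a design choice for this sketch: it makes a non-cosine distance fail loudly rather than produce a misleading score.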

Ranking Strategies:

By Similarity Score (Recommended): Sort descending by similarity_score
- Higher scores = more relevant
- Intuitive: 0.9 > 0.7 > 0.5
By Distance: Sort ascending by distance
- Lower distances = more relevant
- Direct from vector search
- Yields the same order as sorting by similarity_score, because similarity_score = 1 - distance
Hybrid with Collection Priority: Rank by score within each collection, then merge
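The skill does not spell out how the per-collection rankings are merged. One possible reading, sketched here, is to concatenate the ranked groups in a caller-supplied priority order. `rank_with_collection_priority` and its `priority` parameter are hypothetical names; the sketch assumes each document carries `collection` and `similarity_score` keys.

```python
from collections import defaultdict


def rank_with_collection_priority(documents, priority):
    """Rank within each collection, then concatenate groups in priority order."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc["collection"]].append(doc)

    # Collections listed in `priority` come first, in that order;
    # any remaining collections follow in alphabetical order.
    ordered = [name for name in priority if name in groups]
    ordered += sorted(name for name in groups if name not in priority)

    merged = []
    for name in ordered:
        merged.extend(
            sorted(groups[name], key=lambda d: d["similarity_score"], reverse=True)
        )
    return merged


docs = [
    {"document": "a", "collection": "faq", "similarity_score": 0.95},
    {"document": "b", "collection": "catalog", "similarity_score": 0.62},
    {"document": "c", "collection": "catalog", "similarity_score": 0.88},
]
merged = rank_with_collection_priority(docs, priority=["catalog", "faq"])
```

With this merge, a lower-scoring document from a prioritized collection can land ahead of a higher-scoring one from another collection, which is the point of the strategy but worth keeping in mind before applying a top-k cut.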

Critical: NEVER query ChromaDB

This skill operates on already-retrieved documents. The Retrieval Agent has already computed distances and similarity scores. Ranking simply sorts and filters based on these existing metrics.

Implementation: Functions should be in components/ranker.py, similar to components/grader.py.
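A minimal sketch of what components/ranker.py could contain, using the function names listed under Key functions. The bodies are assumptions, not the project's actual code: documents are taken to be plain dicts, and the threshold is treated as exclusive to match the "similarity > 0.7" wording in the examples.

```python
from typing import Any

Document = dict[str, Any]


def rank_documents_by_similarity(documents: list[Document]) -> list[Document]:
    """Return a new list sorted by similarity_score, highest first."""
    return sorted(documents, key=lambda d: d["similarity_score"], reverse=True)


def rank_documents_by_distance(documents: list[Document]) -> list[Document]:
    """Return a new list sorted by distance, lowest first."""
    return sorted(documents, key=lambda d: d["distance"])


def filter_by_similarity_threshold(
    documents: list[Document], threshold: float = 0.7
) -> list[Document]:
    """Keep documents whose similarity_score is strictly above threshold."""
    return [d for d in documents if d["similarity_score"] > threshold]


def get_top_k_documents(ranked_documents: list[Document], k: int = 10) -> list[Document]:
    """Return the first k documents of an already-ranked list."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return ranked_documents[:k]


docs = [
    {"document": "Laptop A", "similarity_score": 0.85, "distance": 0.15},
    {"document": "Laptop B", "similarity_score": 0.92, "distance": 0.08},
    {"document": "Laptop C", "similarity_score": 0.73, "distance": 0.27},
]
ranked = rank_documents_by_similarity(docs)
top_two = get_top_k_documents(ranked, k=2)
```

None of these functions touches ChromaDB; they only sort and slice the metrics already attached to each document. Python's `sorted` is stable, so documents with equal scores keep their input order.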

Examples

Example 1: Basic ranking by similarity

python

from components.ranker import rank_documents_by_similarity

# Input: Documents from Retrieval Agent
# Each has: document, metadata, distance, collection, similarity_score
documents = [
    {'document': 'Laptop A...', 'similarity_score': 0.85, 'distance': 0.15, ...},
    {'document': 'Laptop B...', 'similarity_score': 0.92, 'distance': 0.08, ...},
    {'document': 'Laptop C...', 'similarity_score': 0.73, 'distance': 0.27, ...},
]

# Rank by similarity (descending)
ranked = rank_documents_by_similarity(documents)

# Output: [Laptop B (0.92), Laptop A (0.85), Laptop C (0.73)]

Example 2: Filter by threshold then rank

python

from components.ranker import filter_by_similarity_threshold, rank_documents_by_similarity

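# Input: 15 documents from 3 collections (5 each).
# Retrieval happens upstream; this call only indicates where the documents come from.
# The ranking functions themselves never query ChromaDB.
documents = retrieve_from_chromadb("gaming laptop", collections=["catalog", "faq", "troubleshooting"])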

# Filter to only high-quality matches (similarity > 0.7)
high_quality = filter_by_similarity_threshold(documents, threshold=0.7)
# Reduced from 15 to 8 documents

# Rank the high-quality matches
ranked = rank_documents_by_similarity(high_quality)

# Output: 8 documents sorted by similarity, all > 0.7

Example 3: Combined ranking and grading workflow

python

from components.ranker import rank_documents_by_similarity, get_top_k_documents
from components.grader import grade_documents, filter_relevant_documents

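# Step 1: Retrieve documents (done by the Retrieval Agent, outside this skill).
# Note: `await` must run inside an async function; `retrieval_agent` is supplied by the caller.
retrieved_docs = await retrieval_agent.retrieve_documents("best laptop for video editing", top_k=5)
# Retrieved 15 documents (top_k=5 from each collection)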

# Step 2: Rank by similarity score
ranked_docs = rank_documents_by_similarity(retrieved_docs)

# Step 3: Take top 10 for grading (reduce cost)
top_docs = get_top_k_documents(ranked_docs, k=10)

# Step 4: Grade for binary relevance
graded_docs = grade_documents("best laptop for video editing", top_docs)
relevant_docs = filter_relevant_documents(graded_docs)

# Output: Only the most relevant documents (high similarity + graded as relevant)

Example 4: Ranking within collections

python

from components.ranker import rank_by_collection

# Input: Mixed documents from multiple collections.
# Retrieval happens upstream; the ranking functions themselves never query ChromaDB.
documents = retrieve_from_chromadb("laptop warranty", collections=["catalog", "faq"])

# Rank within each collection, then combine
ranked = rank_by_collection(documents)

# Output: {
#   'catalog': [doc1 (0.88), doc2 (0.75), doc3 (0.62)],
#   'faq': [doc4 (0.95), doc5 (0.91), doc6 (0.84)]
# }

# Use this to prioritize certain collections or balance results
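To balance results across collections, one option is a round-robin merge of the per-collection rankings. `interleave_ranked_collections` is a hypothetical helper, not part of the skill; it assumes its input has the shape shown in the output above, a dict mapping each collection name to a list already sorted by similarity.

```python
from itertools import zip_longest


def interleave_ranked_collections(ranked_by_collection):
    """Round-robin merge: the best of each collection, then the second best, and so on."""
    merged = []
    # zip_longest pads shorter collections with None, which is skipped below.
    for tier in zip_longest(*ranked_by_collection.values()):
        merged.extend(doc for doc in tier if doc is not None)
    return merged


ranked = {
    "catalog": [
        {"document": "doc1", "similarity_score": 0.88},
        {"document": "doc2", "similarity_score": 0.75},
        {"document": "doc3", "similarity_score": 0.62},
    ],
    "faq": [
        {"document": "doc4", "similarity_score": 0.95},
        {"document": "doc5", "similarity_score": 0.91},
        {"document": "doc6", "similarity_score": 0.84},
    ],
}
balanced = interleave_ranked_collections(ranked)
```

The merged list is no longer globally sorted by score; it trades strict ordering for guaranteed representation of every collection near the top.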

Distance vs Similarity Score

When to use each:

Similarity Score: Easier to understand, use for thresholds and display
- "Keep documents with similarity > 0.7"
- "Top document has 92% similarity"
Distance: Direct from vector search, use for debugging
- "ChromaDB returned distance of 0.08"
- "Check if distance < 0.3 for high confidence"

Conversion:

python

similarity_score = 1 - distance
distance = 1 - similarity_score

Typical thresholds:

similarity_score > 0.8: Very relevant
0.7 < similarity_score <= 0.8: Relevant
0.5 < similarity_score <= 0.7: Possibly relevant
similarity_score <= 0.5: Likely not relevant
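These bands can be expressed as a small lookup, for example when labeling results for display or logging. `relevance_label` is a hypothetical helper; it treats each threshold as an exclusive lower bound, which is an assumption about how boundary values should be handled.

```python
def relevance_label(similarity_score: float) -> str:
    """Map a similarity score to one of the typical threshold bands."""
    if similarity_score > 0.8:
        return "very relevant"
    if similarity_score > 0.7:
        return "relevant"
    if similarity_score > 0.5:
        return "possibly relevant"
    # Assumption: a score of exactly 0.5 falls into the lowest band.
    return "likely not relevant"


labels = [relevance_label(s) for s in (0.92, 0.75, 0.6, 0.3)]
```

The cut-offs are typical starting points rather than fixed rules; suitable values depend on the embedding model and the data.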

Integration with Grading

Ranking and grading serve different purposes:

Ranking: Sorts documents by similarity score (continuous; typically 0-1 in practice, [-1, 1] in general for cosine)
- Fast, cheap (no API calls)
- Based on vector similarity alone
- Use for initial filtering and prioritization
Grading: Binary relevance with reasoning (yes/no)
- Slower, costs tokens (Claude API)
- Semantic understanding of relevance
- Use for final filtering before generation

Recommended workflow:

1. Retrieve documents (Retrieval Agent)
2. Rank by similarity (Ranking skill)
3. Take top-k to reduce grading cost
4. Grade for binary relevance (Grading skill)
5. Generate answer from relevant docs (Generator Agent)
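Steps 2 to 4 of this workflow can be sketched as a single function. `rank_then_grade` and the per-document `grade` callable are hypothetical; the callable stands in for the Grading skill, whose real interface may differ, and is stubbed here with a keyword check so the example runs without any API call.

```python
from typing import Any, Callable

Document = dict[str, Any]


def rank_then_grade(
    query: str,
    retrieved_docs: list[Document],
    grade: Callable[[str, Document], bool],
    k: int = 10,
) -> list[Document]:
    """Rank by similarity, keep the top k, then keep documents the grader accepts."""
    ranked = sorted(retrieved_docs, key=lambda d: d["similarity_score"], reverse=True)
    top_k = ranked[:k]  # only these k documents reach the (costly) grader
    return [doc for doc in top_k if grade(query, doc)]


graded_titles = []  # records which documents the stub grader was asked about


def stub_grader(query: str, doc: Document) -> bool:
    """Stand-in for an LLM grader: accepts documents that mention 'editing'."""
    graded_titles.append(doc["document"])
    return "editing" in doc["document"]


docs = [
    {"document": "office laptop", "similarity_score": 0.60},
    {"document": "video editing laptop", "similarity_score": 0.91},
    {"document": "editing software FAQ", "similarity_score": 0.55},
    {"document": "gaming laptop", "similarity_score": 0.80},
]
relevant = rank_then_grade("best laptop for video editing", docs, stub_grader, k=2)
```

Because the top-k cut happens before grading, the grader is called at most k times regardless of how many documents were retrieved. The trade-off is that a document outside the top k is never graded, even if the grader would have accepted it.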

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/ranking
License: MIT License
