Agent skill
semantic-search-setup
Set up vector embeddings and semantic search for document collections. Use for AI-powered similarity search, finding related documents, and preparing knowledge bases for RAG systems.
Install this agent skill to your Project
npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/data/documents/semantic-search-setup
SKILL.md
Semantic Search Setup
Overview
This skill sets up vector embedding infrastructure for semantic search. Unlike keyword search (FTS5), semantic search finds conceptually similar content even without exact word matches.
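To see why exact-match search falls short, the sketch below measures word overlap between two sentences that mean nearly the same thing. The `keyword_overlap` helper is a hypothetical name introduced for illustration; it is a crude stand-in for keyword matching, not a description of how FTS5 scores results.

```python
def keyword_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


query = "How to fix a bug"
document = "Debugging software issues"

# The two sentences share no words, so a keyword-style comparison scores zero
# even though they describe the same task.
print(keyword_overlap(query, document))  # 0.0
```

An embedding model can still place these two sentences close together, because it compares meaning rather than surface tokens.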
Quick Start
from sentence_transformers import SentenceTransformer
import numpy as np
model = SentenceTransformer('all-MiniLM-L6-v2')
# Generate embeddings
texts = ["How to fix a bug", "Debugging software issues"]
embeddings = model.encode(texts, normalize_embeddings=True)
# Compute similarity
similarity = np.dot(embeddings[0], embeddings[1])
print(f"Similarity: {similarity:.3f}") # cosine similarity, since the embeddings are normalized; the exact value depends on the model
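The same dot-product comparison extends to ranking a whole collection against a query. The sketch below is dependency-free and uses small hand-written vectors in place of real model output; `normalize`, `top_k`, and the toy data are hypothetical and exist only for illustration. With embeddings produced by `model.encode(..., normalize_embeddings=True)`, the normalization step would already be done.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query_vec, doc_vecs, k=2):
    """Return (index, score) pairs for the k documents most similar to the query."""
    q = normalize(query_vec)
    scores = [
        (i, sum(a * b for a, b in zip(q, normalize(d))))
        for i, d in enumerate(doc_vecs)
    ]
    # Highest cosine similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional "embeddings" (assumed values, not real model output).
docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.1, 0.0]

for index, score in top_k(query, docs, k=2):
    print(index, round(score, 3))
```

For larger collections, stacking the embeddings into a NumPy array and computing all scores with a single matrix-vector product would likely be faster than this pure-Python loop.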
When to Use
- Adding AI-powered search to document collections
- Finding conceptually related documents
- Preparing knowledge bases for RAG Q&A systems
- Building recommendation systems
- Enabling "more like this" functionality
Related Skills
- knowledge-base-builder - Build the document database first
- rag-system-builder - Add AI Q&A on top of semantic search
- pdf/text-extractor - Extract text from PDFs
Version History
- 1.1.0 (2026-01-02): Added Quick Start, Execution Checklist, Error Handling, Metrics sections; updated frontmatter with version, category, related_skills
- 1.0.0 (2024-10-15): Initial release with sentence-transformers, cosine similarity search, batch processing
Sub-Skills
- Best Practices
- Execution Checklist
- Error Handling
- Metrics
- Dependencies
- How Semantic Search Works
- Model Selection
- Step 1: Install Dependencies (+5)
- 1. CPU vs GPU (+3)
- Status Monitoring
- Example Usage