Agent skill
chunking-strategy
Provides chunking strategies for RAG systems. Generates chunk size recommendations (256-1024 tokens), overlap percentages (10-20%), and semantic boundary detection methods. Validates semantic coherence and evaluates retrieval precision/recall metrics. Use when building retrieval-augmented generation systems, vector databases, or processing large documents.
Install this agent skill to your Project
npx add-skill https://github.com/giuseppe-trisciuoglio/developer-kit/tree/main/plugins/developer-kit-ai/skills/chunking-strategy
SKILL.md
Chunking Strategy for RAG Systems
Overview
Provides chunking strategies for RAG systems, vector databases, and document processing. Recommends chunk sizes, overlap percentages, and boundary detection methods; validates semantic coherence; evaluates retrieval metrics.
When to Use
Use when building or optimizing RAG systems, vector search pipelines, document chunking workflows, or performance-tuning existing systems with poor retrieval quality.
Instructions
Choose Chunking Strategy
Select based on document type and use case:
-
Fixed-Size Chunking (Level 1)
- Use for simple documents without clear structure
- Start with 512 tokens and 10-20% overlap
- Adjust: 256 for factoid queries, 1024 for analytical
-
Recursive Character Chunking (Level 2)
- Use for documents with structural boundaries
- Hierarchical separators: paragraphs → sentences → words
- Customize for document types (HTML, Markdown, JSON)
-
Structure-Aware Chunking (Level 3)
- Use for structured content (Markdown, code, tables, PDFs)
- Preserve semantic units: functions, sections, table blocks
- Validate structure preservation post-split
-
Semantic Chunking (Level 4)
- Use for complex documents with thematic shifts
- Embedding-based boundary detection with 0.8 similarity threshold
- Buffer size: 3-5 sentences
-
Advanced Methods (Level 5)
- Late Chunking for long-context models
- Contextual Retrieval for high-precision requirements
- Monitor computational cost vs. retrieval gain
Reference: references/strategies.md.
Implement Chunking Pipeline
-
Pre-process documents
- Analyze structure, content types, information density
- Identify multi-modal content (tables, images, code)
-
Select parameters
- Chunk size: embedding model context window / 4
- Overlap: 10-20% for most cases
- Strategy-specific settings
-
Process and validate
- Apply chunking strategy
- Validate coherence: run
evaluate_chunks.py --coherence(see below) - Test with representative documents
-
Evaluate and iterate
- Measure precision and recall
- If precision < 0.7: reduce chunk_size by 25% and re-evaluate
- If recall < 0.6: increase overlap by 10% and re-evaluate
- Monitor latency and memory usage
Reference: references/implementation.md.
Validate Chunk Quality
Run validation commands to assess chunk quality:
# Check semantic coherence (requires sentence-transformers)
python -c "
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
chunks = [...] # your chunks
embeddings = model.encode(chunks)
similarity = (embeddings @ embeddings.T).mean()
print(f'Cohesion: {similarity:.3f}') # target: 0.3-0.7
"
# Measure retrieval precision
python -c "
relevant = sum(1 for c in retrieved if c in relevant_chunks)
precision = relevant / len(retrieved)
print(f'Precision: {precision:.2f}') # target: >= 0.7
"
# Check chunk size distribution
python -c "
import numpy as np
sizes = [len(c.split()) for c in chunks]
print(f'Mean: {np.mean(sizes):.0f}, Std: {np.std(sizes):.0f}')
print(f'Min: {min(sizes)}, Max: {max(sizes)}')
"
Reference: references/evaluation.md.
Examples
Fixed-Size Chunking
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
chunk_size=256,
chunk_overlap=25,
length_function=len
)
chunks = splitter.split_documents(documents)
Structure-Aware Code Chunking
import ast
def chunk_python_code(code):
tree = ast.parse(code)
chunks = []
for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
chunks.append(ast.get_source_segment(code, node))
return chunks
Semantic Chunking
def semantic_chunk(text, similarity_threshold=0.8):
sentences = split_into_sentences(text)
embeddings = generate_embeddings(sentences)
chunks, current = [], [sentences[0]]
for i in range(1, len(sentences)):
sim = cosine_similarity(embeddings[i-1], embeddings[i])
if sim < similarity_threshold:
chunks.append(" ".join(current))
current = [sentences[i]]
else:
current.append(sentences[i])
chunks.append(" ".join(current))
return chunks
Best Practices
Core Principles
- Balance context preservation with retrieval precision
- Maintain semantic coherence within chunks
- Optimize for embedding model context window constraints
Implementation
- Start with fixed-size (512 tokens, 15% overlap)
- Iterate based on document characteristics
- Test with domain-specific documents before deployment
Pitfalls to Avoid
- Over-chunking: context-poor small chunks
- Under-chunking: missing information in oversized chunks
- Ignoring semantic boundaries and document structure
- One-size-fits-all for diverse content types
Constraints and Warnings
Resource Considerations
- Semantic methods require significant compute resources
- Late chunking needs long-context embedding models
- Complex strategies increase processing latency
- Monitor memory for large document batches
Quality Requirements
- Validate semantic coherence post-processing
- Test with representative documents before deployment
- Ensure chunks maintain standalone meaning
- Implement error handling for malformed content
References
- strategies.md - Detailed strategies
- implementation.md - Implementation guidelines
- evaluation.md - Performance metrics
- tools.md - Libraries and frameworks
- research.md - Research papers
- advanced-strategies.md - 11 advanced methods
- semantic-methods.md - Semantic approaches
- visualization-tools.md - Visualization tools
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
aws-cli-beast
Provides advanced AWS CLI patterns for managing EC2, Lambda, S3, DynamoDB, RDS, VPC, IAM, and CloudWatch. Generates bulk operation scripts, automates cross-service workflows, validates security configurations, and executes JMESPath queries for complex filtering. Triggers on "aws cli help", "aws command line", "aws scripting", "aws automation", "aws batch operations", "aws bulk operations", "aws cli pagination", "aws multi-region", "aws profiles", "aws cli troubleshooting".
aws-cost-optimization
Provides structured AWS cost optimization guidance using five pillars (right-sizing, elasticity, pricing models, storage optimization, monitoring) and twelve actionable best practices with executable AWS CLI examples. Use when optimizing AWS costs, reviewing AWS spending, finding unused AWS resources, implementing FinOps practices, reducing EC2/EBS/S3 bills, configuring AWS Budgets, or performing AWS Well-Architected cost reviews.
aws-sam-bootstrap
Provides AWS SAM bootstrap patterns: generates `template.yaml` and `samconfig.toml` for new projects via `sam init`, creates SAM templates for existing Lambda/CloudFormation code migration, validates build/package/deploy workflows, and configures local testing with `sam local invoke`. Use when the user asks about SAM projects, `sam init`, `sam deploy`, serverless deployments, or needs to bootstrap/migrate Lambda functions with SAM templates.
aws-drawio-architecture-diagrams
Creates professional AWS architecture diagrams in draw.io XML format (.drawio files) using official AWS Architecture Icons (aws4 library). Use when the user asks for AWS diagrams, VPC layouts, multi-tier architectures, serverless designs, network topology, or draw.io exports involving Lambda, EC2, RDS, or other AWS services.
aws-cloudformation-bedrock
Provides AWS CloudFormation patterns for Amazon Bedrock resources including agents, knowledge bases, data sources, guardrails, prompts, flows, and inference profiles. Use when creating Bedrock agents with action groups, implementing RAG with knowledge bases, configuring vector stores, setting up content moderation guardrails, managing prompts, orchestrating workflows with flows, and configuring inference profiles for model optimization.
aws-cloudformation-s3
Provides AWS CloudFormation patterns for Amazon S3. Use when creating S3 buckets, policies, versioning, lifecycle rules, and implementing template structure with Parameters, Outputs, Mappings, Conditions, and cross-stack references.
Didn't find tool you were looking for?