Agent skill
surrealdb
Comprehensive SurrealDB patterns for hybrid RAG, including vector search, graph traversal, timeline queries, and combined semantic-graph-temporal operations.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/surrealdb
SKILL.md
SurrealDB Skill
This skill provides comprehensive guidance for working with SurrealDB in the Bookstrap framework, covering database patterns, query strategies, and hybrid retrieval-augmented generation (RAG).
SurrealDB Overview
SurrealDB is a multi-model database that combines:
- Document storage: JSON-like records
- Graph database: Native relationships with RELATE statements
- Vector search: Native embedding support with MTREE indexes
- SQL-like queries: Familiar SurrealQL syntax
This makes it ideal for book writing, where you need:
- Semantic search (find similar content)
- Graph traversal (character relationships, citation chains)
- Timeline queries (chronological events, sequences)
- Combined queries (semantic + graph + timeline)
Core Database Patterns
1. Hybrid RAG Architecture
Hybrid RAG combines three retrieval strategies:
Semantic Search + Graph Traversal + Timeline Queries
(vectors) (relationships) (sequences)
↓ ↓ ↓
Combined Context
↓
Writing/Editing
Why Hybrid?
- Semantic alone: Misses explicit relationships
- Graph alone: Misses conceptual similarity
- Timeline alone: Misses both semantics and structure
- Combined: Comprehensive context retrieval
2. Vector Search Pattern
SurrealDB uses MTREE indexes for efficient k-nearest neighbor (kNN) search.
Basic Vector Search:
-- Find similar sections by semantic meaning
SELECT *, vector::similarity::cosine(embedding, $query_vector) AS similarity
FROM section
WHERE vector::similarity::cosine(embedding, $query_vector) > 0.7
ORDER BY similarity DESC
LIMIT 5;
Vector Index Configuration:
DEFINE INDEX idx_section_embedding
ON section
FIELDS embedding
MTREE DIMENSION 1536 DIST COSINE TYPE F32;
Important: Embedding dimensions must match your provider:
- Gemini
text-embedding-004: 768 dims - OpenAI
text-embedding-3-small: 1536 dims - Ollama
nomic-embed-text: 768 dims
3. Graph Traversal Pattern
Use -> and <- operators to traverse relationships.
Outbound Traversal (following relationships):
-- Get all sections where character Anna appears
SELECT ->appears_in->section.*
FROM character:anna;
Inbound Traversal (reverse relationships):
-- Get all characters in a section
SELECT <-appears_in<-character.*
FROM section:ch3_s2;
Multi-hop Traversal:
-- Get characters known by Anna's friends
SELECT ->knows->character->knows->character.*
FROM character:anna;
4. Timeline Query Pattern
Use sequence fields and datetime for chronological ordering.
Sequence-Based:
-- Get events before current point
SELECT * FROM event
WHERE sequence < $current_sequence
ORDER BY sequence DESC
LIMIT 10;
Date-Based:
-- Get events in date range
SELECT * FROM event
WHERE date >= $start_date
AND date <= $end_date
ORDER BY date ASC;
Query Strategy Selection
Choose query strategy based on retrieval goal:
| Goal | Strategy | Example Query |
|---|---|---|
| Find similar content | Vector search | "Find sections about training" |
| Get related entities | Graph traversal | "Get Anna's relationships" |
| Timeline context | Sequence query | "Events before chapter 5" |
| Comprehensive context | Hybrid (all three) | "Anna's training: similar content + relationships + timeline" |
Pre-Write Query Workflow
Before writing a section, gather context using hybrid queries:
Step 1: Semantic Search
Find conceptually similar content:
-- Find related sections by theme
SELECT *, vector::similarity::cosine(embedding, $query_embedding) AS similarity
FROM section
WHERE vector::similarity::cosine(embedding, $query_embedding) > 0.7
ORDER BY similarity DESC
LIMIT 5;
Step 2: Graph Context
Get related entities and relationships:
-- Get characters and their relationships
SELECT
c.name,
c.description,
->knows->character.name AS knows,
->appears_in->section.id AS appears_in
FROM character c
WHERE c.id IN $character_ids;
Step 3: Timeline Context
Get chronological context:
-- Get recent events
SELECT * FROM event
WHERE sequence < $current_sequence
ORDER BY sequence DESC
LIMIT 10;
Step 4: Citation Chain
Get source support:
-- Get sources for concept
SELECT
<-supports<-source.title,
<-supports<-source.reliability,
<-supports<-source.url
FROM concept:wireless_protocols;
Hybrid Query Patterns
Combine multiple strategies for comprehensive context.
Pattern 1: Semantic + Graph
-- Find similar sections that mention Anna
SELECT
s.*,
vector::similarity::cosine(s.embedding, $query_embedding) AS similarity
FROM section s
WHERE vector::similarity::cosine(s.embedding, $query_embedding) > 0.7
AND s->appears_in->character:anna
ORDER BY similarity DESC
LIMIT 5;
Pattern 2: Semantic + Timeline
-- Find similar sections before current point
SELECT
s.*,
vector::similarity::cosine(s.embedding, $query_embedding) AS similarity
FROM section s
WHERE vector::similarity::cosine(s.embedding, $query_embedding) > 0.7
AND s.sequence < $current_sequence
ORDER BY similarity DESC, s.sequence DESC
LIMIT 5;
Pattern 3: Graph + Timeline
-- Get Anna's appearances in chronological order
SELECT
s.*,
s.sequence
FROM character:anna->appears_in->section s
WHERE s.sequence < $current_sequence
ORDER BY s.sequence ASC;
Pattern 4: Full Hybrid (Semantic + Graph + Timeline)
-- Comprehensive context query
SELECT
s.*,
vector::similarity::cosine(s.embedding, $query_embedding) AS similarity,
s.sequence
FROM section s
WHERE vector::similarity::cosine(s.embedding, $query_embedding) > 0.7
AND s->appears_in->character IN $character_ids
AND s.sequence < $current_sequence
ORDER BY similarity DESC, s.sequence DESC
LIMIT 10;
Entity Extraction and Storage
When ingesting sources or writing sections, extract and store entities.
Character Extraction
-- Create character with embedding
CREATE character SET
name = "Anna",
description = "SOE wireless operator, recruited 1942",
embedding = $character_embedding,
status = "alive",
introduced_in = section:ch1_s3;
Location Extraction
-- Create location
CREATE location SET
name = "Beaulieu Manor",
description = "SOE training facility in Hampshire",
embedding = $location_embedding,
introduced = true;
Event Extraction
-- Create event with timeline info
CREATE event SET
name = "Anna begins training",
description = "Wireless operator course starts at Beaulieu",
embedding = $event_embedding,
sequence = 5,
date = "1942-08-15T00:00:00Z";
Relationship Creation
-- Link character to section
RELATE character:anna->appears_in->section:ch3_s2;
-- Link section to location
RELATE section:ch3_s2->located_in->location:beaulieu;
-- Link section to source
RELATE section:ch3_s2->cites->source:soe_manual;
-- Timeline relationship
RELATE event:training_begins->precedes->event:deployment;
Query Optimization
1. Use Vector Indexes
Ensure MTREE indexes exist for all embedding fields:
-- Check existing indexes
INFO FOR TABLE section;
-- Create missing index
DEFINE INDEX idx_section_embedding
ON section
FIELDS embedding
MTREE DIMENSION 1536 DIST COSINE TYPE F32;
2. Filter Before Vector Search
Apply filters before vector operations when possible:
-- Efficient: Filter first, then vector search
SELECT * FROM section
WHERE chapter = $chapter_id
AND vector::similarity::cosine(embedding, $query_vector) > 0.7;
-- Less efficient: Vector search entire table
SELECT * FROM section
WHERE vector::similarity::cosine(embedding, $query_vector) > 0.7
AND chapter = $chapter_id;
3. Limit Results Early
Use LIMIT to reduce processing:
SELECT * FROM section
WHERE vector::similarity::cosine(embedding, $query_vector) > 0.7
ORDER BY vector::similarity::cosine(embedding, $query_vector) DESC
LIMIT 5; -- Only process top 5
4. Use LET for Reusable Queries
Store intermediate results:
-- Store character list
LET $characters = (
SELECT * FROM character WHERE id IN $character_ids
);
-- Reuse in multiple queries
SELECT * FROM section
WHERE ->appears_in->character IN $characters;
Common Query Patterns
Find Uncited Claims
SELECT * FROM section
WHERE count(->cites->source) = 0
AND length(content) > 100;
Get Citation Chain
SELECT
s.id,
s.content,
->cites->source.title AS sources,
->cites->source.reliability AS reliability
FROM section s
WHERE s.id = $section_id;
Character Relationship Graph
SELECT
c1.name AS character,
->knows->character.name AS knows,
<-knows<-character.name AS known_by
FROM character c1;
Timeline Consistency Check
-- Find events with contradictory sequence and dates
SELECT
e1.name,
e1.sequence,
e1.date,
e2.name,
e2.sequence,
e2.date
FROM event e1, event e2
WHERE e1.sequence < e2.sequence
AND e1.date > e2.date;
Database Schema Reference
See schema.surql for the complete database schema including:
- Table definitions (chapter, section, source, character, location, event, concept)
- Field types and constraints
- Vector indexes (MTREE configuration)
- Edge tables (appears_in, located_in, cites, supports, precedes, follows, knows, related_to)
Query Pattern Files
This skill includes specialized query pattern files:
semantic.surql: Vector search patterns with examplesgraph.surql: Graph traversal patterns for relationshipstimeline.surql: Timeline and sequence query patterns
Load these files when you need specific query types.
Best Practices
- Always use embeddings: Generate embeddings for all content
- Create relationships: Use RELATE for explicit connections
- Track sequences: Maintain timeline order with sequence fields
- Verify indexes: Ensure MTREE indexes exist before vector queries
- Combine strategies: Use hybrid queries for comprehensive context
- Limit results: Don't retrieve more than you need
- Store metadata: Track created_at, updated_at for debugging
- Use transactions: Group related operations for consistency
Error Handling
Common Errors
Vector dimension mismatch:
Error: Vector dimensions do not match (expected 1536, got 768)
Fix: Ensure config embedding dimensions match schema indexes.
Missing index:
Error: No vector index found on field 'embedding'
Fix: Create MTREE index with DEFINE INDEX.
Invalid relationship:
Error: Cannot RELATE records of incompatible types
Fix: Verify edge table schema matches record types.
Supporting Files
schema.surql— Complete database schema referencesemantic.surql— Vector search query patternsgraph.surql— Graph traversal examplestimeline.surql— Timeline query patterns
Didn't find tool you were looking for?