Vector Database Patterns

Comprehensive guide to vector databases including Pinecone, Qdrant, Weaviate, embedding strategies, and similarity search.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/vector-database

Overview

Vector databases are specialized databases designed to store, index, and query high-dimensional vectors efficiently. They enable similarity search by finding vectors that are "closest" to a query vector using various distance metrics. This skill covers Pinecone, Qdrant, Weaviate, embedding strategies, similarity search, performance optimization, and production considerations.

Prerequisites

  • Understanding of vectors and embeddings
  • Knowledge of machine learning concepts
  • Familiarity with Python or TypeScript
  • Understanding of similarity metrics (cosine, Euclidean, dot product)
  • Basic knowledge of database concepts

Key Concepts

Vector Database Fundamentals

  • Vectors: Numerical representations of data (text, images, audio) in high-dimensional space
  • Embeddings: Vectors generated by machine learning models that capture semantic meaning
  • Distance Metrics: Measures of similarity between vectors (cosine, Euclidean, dot product)
  • Indexing: Data structures that enable fast similarity search
  • Metadata: Additional information associated with vectors for filtering
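
To make these terms concrete, here is a minimal sketch of what a vector database does conceptually: an exhaustive (brute-force) cosine search with an optional metadata predicate. Indexes exist to avoid this full scan. The record layout and the names `brute_force_search` and `where` are illustrative assumptions, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, records, top_k=3, where=None):
    """Score every record against the query and return the top_k best.

    records: list of (id, vector, metadata) tuples (hypothetical layout).
    where:   optional predicate over the metadata dict, mimicking a filter.
    """
    scored = [
        (cosine_similarity(query, vector), record_id)
        for record_id, vector, metadata in records
        if where is None or where(metadata)
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(record_id, score) for score, record_id in scored[:top_k]]

records = [
    ("doc1", [1.0, 0.0], {"category": "tech"}),
    ("doc2", [0.0, 1.0], {"category": "science"}),
    ("doc3", [1.0, 1.0], {"category": "tech"}),
]
hits = brute_force_search([1.0, 0.1], records, top_k=2)
print([record_id for record_id, _ in hits])
```

The scan costs one similarity computation per stored vector, which is why large collections rely on approximate indexes instead.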

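Vector Database Options

  • Pinecone: Fully managed cloud service; no infrastructure to operate
  • Qdrant: Open-source engine that can be self-hosted or used as a managed cloud; strong payload filtering
  • Weaviate: Open-source, GraphQL and REST APIs, built-in vectorizer modules and hybrid search; good for multimodal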

Use Cases

  • Semantic search (finding similar documents, products, images)
  • Recommendation systems
  • Anomaly detection
  • Natural language processing tasks
  • Computer vision applications
  • Personalization engines
  • Knowledge retrieval for RAG (Retrieval-Augmented Generation)

Implementation Guide

Pinecone

Setup and Indexing

python
# Install Pinecone client
# pip install pinecone  (the package was published as pinecone-client before it was renamed)

from pinecone import Pinecone, ServerlessSpec

# Initialize Pinecone
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,  # OpenAI embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Connect to index
index = pc.Index("my-index")

# Check index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Dimension: {stats['dimension']}")
typescript
// Install Pinecone client
// npm install @pinecone-database/pinecone

import { Pinecone } from '@pinecone-database/pinecone';

// Initialize Pinecone
const pinecone = new Pinecone({
  apiKey: 'your-api-key'
});

// Create index
await pinecone.createIndex({
  name: 'my-index',
  dimension: 1536,
  metric: 'cosine',
  spec: {
    serverless: {
      cloud: 'aws',
      region: 'us-east-1'
    }
  }
});

// Connect to index
const index = pinecone.index('my-index');

// Check index stats
const stats = await index.describeIndexStats();
console.log('Total vectors:', stats.totalRecordCount);
console.log('Dimension:', stats.dimension);

Upserting Vectors

python
# Upsert single vector
index.upsert(
    vectors=[
        {
            "id": "doc1",
            "values": [0.1, 0.2, 0.3, ...],  # 1536-dimensional vector
            "metadata": {
                "title": "Document 1",
                "category": "technology",
                "date": "2024-01-01"
            }
        }
    ]
)

# Upsert multiple vectors
index.upsert(
    vectors=[
        {
            "id": "doc1",
            "values": vector1,
            "metadata": {"title": "Document 1", "category": "tech"}
        },
        {
            "id": "doc2",
            "values": vector2,
            "metadata": {"title": "Document 2", "category": "science"}
        },
        {
            "id": "doc3",
            "values": vector3,
            "metadata": {"title": "Document 3", "category": "tech"}
        }
    ],
    namespace="documents"
)

# Upsert in batches
from tqdm import tqdm

def upsert_in_batches(vectors, batch_size=100):
    for i in tqdm(range(0, len(vectors), batch_size)):
        batch = vectors[i:i + batch_size]
        index.upsert(vectors=batch)
typescript
// Upsert single vector
await index.upsert([
  {
    id: 'doc1',
    values: [0.1, 0.2, 0.3, ...], // 1536-dimensional vector
    metadata: {
      title: 'Document 1',
      category: 'technology',
      date: '2024-01-01'
    }
  }
]);

// Upsert multiple vectors
await index.upsert([
  {
    id: 'doc1',
    values: vector1,
    metadata: { title: 'Document 1', category: 'tech' }
  },
  {
    id: 'doc2',
    values: vector2,
    metadata: { title: 'Document 2', category: 'science' }
  },
  {
    id: 'doc3',
    values: vector3,
    metadata: { title: 'Document 3', category: 'tech' }
  }
]);

// Upsert with namespace (select the namespace first, then upsert)
await index.namespace('documents').upsert([
  {
    id: 'doc1',
    values: vector1,
    metadata: { title: 'Document 1' }
  }
]);

Querying

python
# Basic similarity search
results = index.query(
    vector=query_vector,
    top_k=10,
    include_metadata=True,
    include_values=False
)

for match in results['matches']:
    print(f"ID: {match['id']}, Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")

# Query with namespace
results = index.query(
    vector=query_vector,
    top_k=10,
    namespace="documents",
    include_metadata=True
)

# Query with filter
# Range operators ($gt, $gte, $lt, $lte) compare numbers only, so this assumes
# `date` is stored as a Unix timestamp (1704067200 = 2024-01-01T00:00:00Z).
results = index.query(
    vector=query_vector,
    top_k=10,
    filter={
        "category": {"$eq": "technology"},
        "date": {"$gte": 1704067200}
    },
    include_metadata=True
)

# Query with complex filter (top-level keys are combined with AND)
results = index.query(
    vector=query_vector,
    top_k=10,
    filter={
        "$or": [
            {"category": {"$eq": "technology"}},
            {"category": {"$eq": "science"}}
        ],
        "date": {"$gte": 1704067200}
    },
    include_metadata=True
)
typescript
// Basic similarity search
const results = await index.query({
  vector: queryVector,
  topK: 10,
  includeMetadata: true,
  includeValues: false
});

results.matches.forEach(match => {
  console.log(`ID: ${match.id}, Score: ${match.score}`);
  console.log('Metadata:', match.metadata);
});

// Query with namespace (select the namespace first, then query)
const namespacedResults = await index.namespace('documents').query({
  vector: queryVector,
  topK: 10,
  includeMetadata: true
});

// Query with filter
// Range operators ($gt, $gte, $lt, $lte) compare numbers only, so this assumes
// `date` is stored as a Unix timestamp (1704067200 = 2024-01-01T00:00:00Z).
const filteredResults = await index.query({
  vector: queryVector,
  topK: 10,
  filter: {
    category: { $eq: 'technology' },
    date: { $gte: 1704067200 }
  },
  includeMetadata: true
});

// Query with complex filter (top-level keys are combined with AND)
const complexResults = await index.query({
  vector: queryVector,
  topK: 10,
  filter: {
    $or: [
      { category: { $eq: 'technology' } },
      { category: { $eq: 'science' } }
    ],
    date: { $gte: 1704067200 }
  },
  includeMetadata: true
});
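
The filter semantics used above (implicit AND across top-level keys, `$or` over a list of sub-filters) can be illustrated with a toy in-memory evaluator. This is a sketch for reasoning about filters, not Pinecone's implementation; the `matches` function and `COMPARATORS` table are hypothetical names, only a subset of operators is covered, and dates are assumed to be stored as Unix timestamps.

```python
import operator

# Comparison operators used in the examples above (a subset, not the full language).
COMPARATORS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}

def matches(metadata, flt):
    """Return True if metadata satisfies the filter; top-level keys are ANDed."""
    for key, condition in flt.items():
        if key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False
            for op, operand in condition.items():
                if not COMPARATORS[op](metadata[key], operand):
                    return False
    return True

complex_filter = {
    "$or": [
        {"category": {"$eq": "technology"}},
        {"category": {"$eq": "science"}},
    ],
    "date": {"$gte": 1704067200},  # 2024-01-01T00:00:00Z as a Unix timestamp
}

print(matches({"category": "science", "date": 1706745600}, complex_filter))
print(matches({"category": "science", "date": 1672531200}, complex_filter))
print(matches({"category": "sports", "date": 1706745600}, complex_filter))
```

Only the first record passes: the second fails the date condition and the third fails the `$or` branch.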

Deleting Vectors

python
# Delete single vector
index.delete(ids=["doc1"])

# Delete multiple vectors
index.delete(ids=["doc1", "doc2", "doc3"])

# Delete all vectors in namespace
index.delete(delete_all=True, namespace="documents")

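# Delete by filter
# Range operators compare numbers only, so `date` is assumed to be a Unix timestamp.
index.delete(
    filter={
        "category": {"$eq": "old"},
        "date": {"$lt": 1672531200}  # 2023-01-01T00:00:00Z
    },
    namespace="documents"
)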
typescript
// Delete single vector
await index.deleteOne('doc1');

// Delete multiple vectors
await index.deleteMany(['doc1', 'doc2', 'doc3']);

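// Delete all vectors in namespace
await index.namespace('documents').deleteAll();

// Delete by filter (the filter object is passed directly)
// Range operators compare numbers only, so `date` is assumed to be a Unix timestamp.
await index.namespace('documents').deleteMany({
  category: { $eq: 'old' },
  date: { $lt: 1672531200 } // 2023-01-01T00:00:00Z
});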

Qdrant

Collections and Points

python
# Install Qdrant client
# pip install qdrant-client

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Initialize Qdrant client
client = QdrantClient(url="http://localhost:6333")

# Create collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,
        distance=Distance.COSINE  # or Distance.EUCLID, Distance.DOT
    )
)

# Create collection with multiple vectors
client.create_collection(
    collection_name="multimodal",
    vectors_config={
        "text": VectorParams(size=1536, distance=Distance.COSINE),
        "image": VectorParams(size=512, distance=Distance.EUCLID)
    }
)

# List collections
collections = client.get_collections()
for collection in collections.collections:
    print(f"Collection: {collection.name}")

# Get collection info
info = client.get_collection("documents")
print(f"Indexed vectors count: {info.indexed_vectors_count}")  # vectors_count is deprecated
print(f"Points count: {info.points_count}")
typescript
// Install Qdrant client
// npm install @qdrant/js-client-rest

import { QdrantClient } from '@qdrant/js-client-rest';

// Initialize Qdrant client
const client = new QdrantClient({
  url: 'http://localhost:6333'
});

// Create collection
await client.createCollection('documents', {
  vectors: {
    size: 1536,
    distance: 'Cosine' // or 'Euclid', 'Dot'
  }
});

// Create collection with multiple vectors
await client.createCollection('multimodal', {
  vectors: {
    text: { size: 1536, distance: 'Cosine' },
    image: { size: 512, distance: 'Euclid' }
  }
});

// List collections
const collections = await client.getCollections();
collections.collections.forEach(collection => {
  console.log('Collection:', collection.name);
});

// Get collection info
const info = await client.getCollection('documents');
// The REST client returns snake_case fields
console.log('Indexed vectors count:', info.indexed_vectors_count);
console.log('Points count:', info.points_count);

Inserting Points

python
# Insert single point
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=[0.1, 0.2, 0.3, ...],
            payload={
                "title": "Document 1",
                "category": "technology",
                "date": "2024-01-01"
            }
        )
    ]
)

# Insert multiple points
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(id=1, vector=vector1, payload={"title": "Doc 1", "category": "tech"}),
        PointStruct(id=2, vector=vector2, payload={"title": "Doc 2", "category": "science"}),
        PointStruct(id=3, vector=vector3, payload={"title": "Doc 3", "category": "tech"}),
    ]
)

# Insert in batches
from qdrant_client.models import Batch

def insert_in_batches(points, batch_size=100):
    for i in range(0, len(points), batch_size):
        batch = points[i:i + batch_size]
        client.upsert(
            collection_name="documents",
            points=Batch(
                ids=[p.id for p in batch],
                vectors=[p.vector for p in batch],
                payloads=[p.payload for p in batch]
            )
        )
typescript
// Insert single point
await client.upsert('documents', {
  points: [{
    id: 1,
    vector: [0.1, 0.2, 0.3, ...],
    payload: {
      title: 'Document 1',
      category: 'technology',
      date: '2024-01-01'
    }
  }]
});

// Insert multiple points
await client.upsert('documents', {
  points: [
    { id: 1, vector: vector1, payload: { title: 'Doc 1', category: 'tech' } },
    { id: 2, vector: vector2, payload: { title: 'Doc 2', category: 'science' } },
    { id: 3, vector: vector3, payload: { title: 'Doc 3', category: 'tech' } }
  ]
});

// Insert named vectors
await client.upsert('multimodal', {
  points: [{
    id: 1,
    vector: {
      text: textVector,
      image: imageVector
    },
    payload: {
      title: 'Document 1',
      type: 'multimodal'
    }
  }]
});

Querying

python
# Basic search
results = client.search(
    collection_name="documents",
    query_vector=query_vector,
    limit=10,
    with_payload=True
)

for result in results:
    print(f"ID: {result.id}, Score: {result.score}")
    print(f"Payload: {result.payload}")

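# Search with filter
from qdrant_client.models import DatetimeRange, FieldCondition, Filter, MatchValue

results = client.search(
    collection_name="documents",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="technology")
            ),
            FieldCondition(
                key="date",
                # Range is numeric; date strings need DatetimeRange (Qdrant 1.8+)
                range=DatetimeRange(
                    gte="2024-01-01T00:00:00Z"
                )
            )
        ]
    ),
    limit=10,
    with_payload=True
)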

# Search with named vector
from qdrant_client.models import NamedVector

results = client.search(
    collection_name="multimodal",
    query_vector=NamedVector(
        name="text",
        vector=query_vector
    ),
    limit=10
)

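# Batch search: one request per named vector
# (this sends two vector searches in one call; it is not keyword hybrid search)
from qdrant_client.models import NamedVector, SearchRequest

results = client.search_batch(
    collection_name="multimodal",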
    requests=[
        SearchRequest(
            vector=NamedVector(name="text", vector=query_vector),
            limit=10,
            with_payload=True
        ),
        SearchRequest(
            vector=NamedVector(name="image", vector=image_query_vector),
            limit=10,
            with_payload=True
        )
    ]
)
typescript
// Basic search
const results = await client.search('documents', {
  vector: queryVector,
  limit: 10,
  with_payload: true
});

results.forEach(result => {
  console.log(`ID: ${result.id}, Score: ${result.score}`);
  console.log('Payload:', result.payload);
});

// Search with filter
const results = await client.search('documents', {
  vector: queryVector,
  filter: {
    must: [
      {
        key: 'category',
        match: { value: 'technology' }
      },
      {
        key: 'date',
        range: { gte: '2024-01-01' }
      }
    ]
  },
  limit: 10,
  with_payload: true
});

// Search with named vector
const results = await client.search('multimodal', {
  vector: {
    name: 'text',
    vector: queryVector
  },
  limit: 10
});

// Batch search: one request per named vector (not keyword hybrid search)
const batchResults = await client.searchBatch('multimodal', {
  searches: [
    {
      vector: { name: 'text', vector: queryVector },
      limit: 10
    },
    {
      vector: { name: 'image', vector: imageQueryVector },
      limit: 10
    }
  ]
});

Filtering

python
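# Filters are built from these model classes
from qdrant_client.models import FieldCondition, Filter, MatchValue, Range

# Exact match filter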
filter = Filter(
    must=[
        FieldCondition(
            key="category",
            match=MatchValue(value="technology")
        )
    ]
)

# Range filter
filter = Filter(
    must=[
        FieldCondition(
            key="price",
            range=Range(
                gte=100,
                lte=1000
            )
        )
    ]
)

# OR filter (`should` matches when at least one condition holds)
filter = Filter(
    should=[
        FieldCondition(
            key="category",
            match=MatchValue(value="technology")
        ),
        FieldCondition(
            key="category",
            match=MatchValue(value="science")
        )
    ]
)

# Nested filter
filter = Filter(
    must=[
        FieldCondition(
            key="metadata.category",
            match=MatchValue(value="technology")
        )
    ]
)

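# Is NULL filter (must_not excludes points whose deleted_at is explicitly null)
from qdrant_client.models import IsNullCondition, PayloadField

filter = Filter(
    must_not=[
        IsNullCondition(
            is_null=PayloadField(key="deleted_at")
        )
    ]
)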
typescript
// Exact match filter
const filter = {
  must: [
    {
      key: 'category',
      match: { value: 'technology' }
    }
  ]
};

// Range filter
const filter = {
  must: [
    {
      key: 'price',
      range: { gte: 100, lte: 1000 }
    }
  ]
};

// OR filter (`should` matches when at least one condition holds)
const filter = {
  should: [
    {
      key: 'category',
      match: { value: 'technology' }
    },
    {
      key: 'category',
      match: { value: 'science' }
    }
  ]
};

// Nested filter
const filter = {
  must: [
    {
      key: 'metadata.category',
      match: { value: 'technology' }
    }
  ]
};

Weaviate

Schema Setup

python
# Install Weaviate client
# These examples use the v3 Python API (weaviate.Client); the v4 client has a different API.
# pip install "weaviate-client<4"

import weaviate
from weaviate import Client

# Initialize Weaviate client
client = Client("http://localhost:8080")

# Define schema
schema = {
    "classes": [
        {
            "class": "Document",
            "description": "A document",
            "vectorizer": "text2vec-openai",
            "properties": [
                {
                    "name": "title",
                    "dataType": ["text"],  # "string" is deprecated since Weaviate 1.19
                    "description": "The title of document"
                },
                {
                    "name": "content",
                    "dataType": ["text"],
                    "description": "The content of document"
                },
                {
                    "name": "category",
                    "dataType": ["text"],
                    "description": "The category of document"
                },
                {
                    "name": "date",
                    "dataType": ["date"],
                    "description": "The date of document"
                },
                {
                    "name": "metadata",
                    "dataType": ["object"],
                    "description": "Additional metadata",
                    # Object properties must declare their nested fields
                    "nestedProperties": [
                        {"name": "author", "dataType": ["text"]},
                        {"name": "tags", "dataType": ["text[]"]}
                    ]
                }
            ]
        }
    ]
}

# Create schema
client.schema.create(schema)

# Get schema
schema = client.schema.get()
print(schema)
typescript
// Install Weaviate client
// npm install weaviate-ts-client

import weaviate, { WeaviateClient } from 'weaviate-ts-client';

// Initialize Weaviate client
const client: WeaviateClient = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

// Define schema
const schema = {
  classes: [
    {
      class: 'Document',
      description: 'A document',
      vectorizer: 'text2vec-openai',
      properties: [
        {
          name: 'title',
          dataType: ['text'], // 'string' is deprecated since Weaviate 1.19
          description: 'The title of document'
        },
        {
          name: 'content',
          dataType: ['text'],
          description: 'The content of document'
        },
        {
          name: 'category',
          dataType: ['text'],
          description: 'The category of document'
        },
        {
          name: 'date',
          dataType: ['date'],
          description: 'The date of document'
        },
        {
          name: 'metadata',
          dataType: ['object'],
          description: 'Additional metadata',
          // Object properties must declare their nested fields
          nestedProperties: [
            { name: 'author', dataType: ['text'] },
            { name: 'tags', dataType: ['text[]'] }
          ]
        }
      ]
    }
  ]
};

// Create schema
await client.schema
  .creator()
  .withClass(schema.classes[0])
  .do();

// Get schema
const retrievedSchema = await client.schema.getter().do();
console.log(retrievedSchema);

Inserting Data

python
# Insert single object
client.data_object.create(
    class_name="Document",
    data_object={
        "title": "Document 1",
        "content": "This is content of document 1",
        "category": "technology",
        "date": "2024-01-01T00:00:00Z",
        "metadata": {
            "author": "John Doe",
            "tags": ["tech", "ai"]
        }
    }
)

# Insert multiple objects
objects = [
    {
        "title": "Document 1",
        "content": "Content 1",
        "category": "technology"
    },
    {
        "title": "Document 2",
        "content": "Content 2",
        "category": "science"
    }
]

for obj in objects:
    client.data_object.create(
        class_name="Document",
        data_object=obj
    )

# Insert with custom vector
client.data_object.create(
    class_name="Document",
    data_object={
        "title": "Document 1",
        "content": "Content 1"
    },
    vector=[0.1, 0.2, 0.3, ...]
)

# Batch insert (v3 client: configure the built-in batcher, then use it as a context manager)
client.batch.configure(batch_size=100)

with client.batch as batch:
    for obj in objects:
        batch.add_data_object(
            data_object=obj,
            class_name="Document"
        )
typescript
// Insert single object
await client.data
  .creator()
  .withClassName('Document')
  .withProperties({
    title: 'Document 1',
    content: 'This is content of document 1',
    category: 'technology',
    date: '2024-01-01T00:00:00Z',
    metadata: {
      author: 'John Doe',
      tags: ['tech', 'ai']
    }
  })
  .do();

// Insert multiple objects
const objects = [
  {
    title: 'Document 1',
    content: 'Content 1',
    category: 'technology'
  },
  {
    title: 'Document 2',
    content: 'Content 2',
    category: 'science'
  }
];

for (const obj of objects) {
  await client.data
    .creator()
    .withClassName('Document')
    .withProperties(obj)
    .do();
}

// Insert with custom vector
await client.data
  .creator()
  .withClassName('Document')
  .withProperties({
    title: 'Document 1',
    content: 'Content 1'
  })
  .withVector([0.1, 0.2, 0.3, ...])
  .do();

Querying

python
# Semantic search
results = client.query.get(
    class_name="Document",
    properties=["title", "content", "category"]
).with_near_text({
    "concepts": ["artificial intelligence"],
    "distance": 0.7
}).with_additional(["distance"]).with_limit(10).do()  # request distance so it can be read below

for result in results["data"]["Get"]["Document"]:
    print(f"Title: {result['title']}")
    print(f"Distance: {result['_additional']['distance']}")

# Hybrid search (BM25 + vector)
results = client.query.get(
    class_name="Document",
    properties=["title", "content"]
).with_hybrid(
    query="artificial intelligence",
    alpha=0.7,  # 0 = pure BM25, 1 = pure vector
    vector=query_vector
).with_limit(10).do()

# Filter search
results = client.query.get(
    class_name="Document",
    properties=["title", "content", "category"]
).with_where({
    "path": ["category"],
    "operator": "Equal",
    "valueText": "technology"
}).with_near_text({
    "concepts": ["AI"]
}).with_limit(10).do()

# Filter with range
results = client.query.get(
    class_name="Document",
    properties=["title", "date"]
).with_where({
    "operator": "And",
    "operands": [
        {
            "path": ["category"],
            "operator": "Equal",
            "valueText": "technology"
        },
        {
            "path": ["date"],
            "operator": "GreaterThan",
            "valueDate": "2024-01-01T00:00:00Z"
        }
    ]
}).with_near_text({
    "concepts": ["AI"]
}).do()
typescript
// Semantic search
const results = await client.graphql
  .get()
  .withClassName('Document')
  .withFields('title content category _additional { distance }')
  .withNearText({
    concepts: ['artificial intelligence'],
    distance: 0.7
  })
  .withLimit(10)
  .do();

console.log(results.data.Get.Document);

// Hybrid search (BM25 + vector)
const results = await client.graphql
  .get()
  .withClassName('Document')
  .withFields('title content _additional { score }') // hybrid results carry a fused score
  .withHybrid({
    query: 'artificial intelligence',
    alpha: 0.7, // 0 = pure BM25, 1 = pure vector
    vector: queryVector
  })
  .withLimit(10)
  .do();

// Filter search
const results = await client.graphql
  .get()
  .withClassName('Document')
  .withFields('title content category')
  .withWhere({
    path: ['category'],
    operator: 'Equal',
    valueText: 'technology'
  })
  .withNearText({
    concepts: ['AI']
  })
  .withLimit(10)
  .do();

// Filter with range
const results = await client.graphql
  .get()
  .withClassName('Document')
  .withFields('title date')
  .withWhere({
    operator: 'And',
    operands: [
      {
        path: ['category'],
        operator: 'Equal',
        valueText: 'technology'
      },
      {
        path: ['date'],
        operator: 'GreaterThan',
        valueDate: '2024-01-01T00:00:00Z'
      }
    ]
  })
  .withNearText({
    concepts: ['AI']
  })
  .do();
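
The `alpha` parameter in the hybrid queries above weights the vector result against the keyword (BM25) result. The sketch below shows one simple way such a weighting can work: min-max normalise each score set, then take a weighted sum. It is a simplified illustration under that assumption, not Weaviate's exact fusion algorithm, and the function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}

def fuse(keyword_scores, vector_scores, alpha=0.7):
    """Combine two score sets; alpha=0 is pure keyword, alpha=1 is pure vector."""
    keyword = min_max_normalize(keyword_scores)
    vector = min_max_normalize(vector_scores)
    doc_ids = set(keyword) | set(vector)
    fused = {
        doc_id: alpha * vector.get(doc_id, 0.0) + (1 - alpha) * keyword.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

keyword_scores = {"a": 12.0, "b": 3.0}   # made-up BM25 scores
vector_scores = {"b": 0.9, "c": 0.4}     # made-up similarity scores

print(fuse(keyword_scores, vector_scores, alpha=0.7)[0][0])
print(fuse(keyword_scores, vector_scores, alpha=0.0)[0][0])
```

With `alpha=0.7` the document that scores best on vector similarity ranks first; with `alpha=0.0` the best keyword match does.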

Embedding Strategies

Text Embeddings

python
# Using OpenAI embeddings
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

def get_embedding(text: str) -> list:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Batch embeddings
def get_embeddings(texts: list) -> list:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [item.embedding for item in response.data]

# Chunking for long texts
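def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    # A step of zero or less would fail or never advance
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    for i in range(0, len(text), chunk_size - overlap):
        chunks.append(text[i:i + chunk_size])
        if i + chunk_size >= len(text):
            break  # the tail is already covered by this chunk
    return chunks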
typescript
// Using OpenAI embeddings
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-api-key'
});

async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  return response.data[0].embedding;
}

// Batch embeddings
async function getEmbeddings(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: texts
  });
  return response.data.map(item => item.embedding);
}

// Chunking for long texts
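function chunkText(text: string, chunkSize: number = 1000, overlap: number = 200): string[] {
  // A step of zero or less would loop forever
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error('overlap must be non-negative and smaller than chunkSize');
  }
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize - overlap) {
    chunks.push(text.slice(i, i + chunkSize));
    if (i + chunkSize >= text.length) break; // the tail is already covered
  }
  return chunks;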
}
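
Fixed-width character chunks can cut sentences in half. As a rough alternative, the sketch below packs whole sentences into chunks and repeats the last sentence of each chunk at the start of the next. The function name `chunk_by_sentence` and its parameters are illustrative assumptions, and the regex-based sentence split is deliberately naive.

```python
import re

def chunk_by_sentence(text, max_chars=1000, overlap_sentences=1):
    """Pack whole sentences into chunks of at most max_chars characters.

    The last overlap_sentences sentences of a chunk are repeated at the start
    of the next one when they fit. A single sentence longer than max_chars
    becomes its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:] if overlap_sentences else []
            if len(" ".join(current + [sentence])) > max_chars:
                current = []  # the overlap does not fit; start the chunk clean
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

print(chunk_by_sentence("A one. B two. C three.", max_chars=16))
```

Here the second chunk starts with "B two." again, so a query that matches the boundary still finds its surrounding context.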

Image Embeddings

python
# Using CLIP for image embeddings
# pip install torch pillow git+https://github.com/openai/CLIP.git
from PIL import Image
import clip
import torch

# Load CLIP model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def get_image_embedding(image_path: str) -> list:
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    with torch.no_grad():
        image_features = model.encode_image(image)
    return image_features.cpu().numpy().tolist()[0]

# Batch image embeddings
def get_image_embeddings(image_paths: list) -> list:
    images = torch.stack([preprocess(Image.open(path)) for path in image_paths]).to(device)
    with torch.no_grad():
        image_features = model.encode_image(images)
    return image_features.cpu().numpy().tolist()

Multimodal Embeddings

python
# Using OpenAI CLIP for text-image similarity
def get_text_embedding(text: str) -> list:
    text_tokens = clip.tokenize([text]).to(device)
    with torch.no_grad():
        text_features = model.encode_text(text_tokens)
    return text_features.cpu().numpy().tolist()[0]

def get_image_embedding(image_path: str) -> list:
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    with torch.no_grad():
        image_features = model.encode_image(image)
    return image_features.cpu().numpy().tolist()[0]

# Compute similarity
import numpy as np

def cosine_similarity(vec1: list, vec2: list) -> float:
    v1 = np.array(vec1)
    v2 = np.array(vec2)
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

Similarity Search

Cosine Similarity

python
import numpy as np

def cosine_similarity(vec1: list, vec2: list) -> float:
    v1 = np.array(vec1)
    v2 = np.array(vec2)
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

# Example
vector_a = [1, 2, 3]
vector_b = [2, 4, 6]
similarity = cosine_similarity(vector_a, vector_b)
print(f"Cosine similarity: {similarity}")

Euclidean Distance

python
import numpy as np

def euclidean_distance(vec1: list, vec2: list) -> float:
    v1 = np.array(vec1)
    v2 = np.array(vec2)
    return np.linalg.norm(v1 - v2)

# Example
vector_a = [1, 2, 3]
vector_b = [2, 4, 6]
distance = euclidean_distance(vector_a, vector_b)
print(f"Euclidean distance: {distance}")

Dot Product

python
import numpy as np

def dot_product(vec1: list, vec2: list) -> float:
    v1 = np.array(vec1)
    v2 = np.array(vec2)
    return np.dot(v1, v2)

# Example
vector_a = [1, 2, 3]
vector_b = [2, 4, 6]
product = dot_product(vector_a, vector_b)
print(f"Dot product: {product}")
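
The three metrics are closely related once vectors are normalised to unit length: cosine similarity equals the dot product, and squared Euclidean distance equals 2 − 2·cosine, so all three produce the same ranking. The following stdlib-only sketch checks this numerically; the helper names are chosen for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = normalize([1.0, 2.0, 3.0])
b = normalize([3.0, 1.0, 0.5])

cos = cosine(a, b)
print(math.isclose(cos, dot(a, b)))                       # cosine == dot product
print(math.isclose(euclidean(a, b) ** 2, 2 - 2 * cos))    # distance follows from cosine
```

In practice this means that for unit-length embeddings the choice between cosine and dot product is mostly about what the database computes faster, not about result quality.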

Performance Optimization

Batch Operations

python
# Pinecone batch upsert
def upsert_in_batches(vectors, batch_size=100):
    for i in range(0, len(vectors), batch_size):
        batch = vectors[i:i + batch_size]
        index.upsert(vectors=batch)

# Qdrant batch insert
from qdrant_client.models import Batch

def insert_in_batches(points, batch_size=100):
    for i in range(0, len(points), batch_size):
        batch = points[i:i + batch_size]
        client.upsert(
            collection_name="documents",
            points=Batch(
                ids=[p.id for p in batch],
                vectors=[p.vector for p in batch],
                payloads=[p.payload for p in batch]
            )
        )

Indexing Strategies

python
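# Pinecone: Choose appropriate index type
# Serverless indexes (used in the setup above) need no capacity planning.
# Pod-based indexes require choosing a pod type:
#   s1: storage-optimized, more vectors per pod at higher query latency
#   p1: performance-optimized, lower latency with less capacity per pod
#   p2: highest throughput and lowest latency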

# Qdrant: Configure HNSW parameters
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,
        distance=Distance.COSINE,
        hnsw_config={
            "m": 16,  # Edges per graph node; higher improves recall but uses more memory
            "ef_construct": 100  # Neighbours examined while building; higher improves recall but slows indexing
        }
    )
)
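
HNSW search is approximate, so tuning `m` and `ef_construct` is a trade-off between speed and accuracy. A common way to measure the accuracy side is recall@k: the share of the true top-k neighbours (from an exact search) that the approximate search also returned. The sketch below assumes you already have both result lists as ID sequences; the function names and sample IDs are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours present in the approximate top-k."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)

def mean_recall_at_k(approx_results, exact_results, k):
    """Average recall@k over a batch of queries."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores)

# One list of result IDs per query (made-up data).
exact_results = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
approx_results = [["d1", "d3", "d9"], ["d4", "d5", "d6"]]

print(round(mean_recall_at_k(approx_results, exact_results, k=3), 3))
```

Re-running this after changing the index parameters shows whether a faster configuration is costing too much accuracy.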

Caching

python
# Cache embeddings in memory
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_cached_embedding(text: str) -> list:
    # lru_cache keys on the text itself, so no manual cache key is needed.
    # Treat the returned list as read-only: all callers share the cached object.
    return get_embedding(text)
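
An in-process cache is lost on restart. As a sketch of a persistent variant, the class below stores embeddings in SQLite keyed by a hash of the model name and text. `EmbeddingCache`, its table layout, and the `fake_embed` stand-in are assumptions made for illustration; in real use `embed_fn` would be a function like `get_embedding`.

```python
import hashlib
import json
import sqlite3

class EmbeddingCache:
    """Persistent text -> embedding cache backed by SQLite."""

    def __init__(self, embed_fn, path=":memory:", model="text-embedding-3-small"):
        self.embed_fn = embed_fn
        self.model = model
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS embeddings (key TEXT PRIMARY KEY, vector TEXT NOT NULL)"
        )

    def _key(self, text):
        # Include the model name so switching models never returns stale vectors.
        return hashlib.sha256(f"{self.model}\n{text}".encode("utf-8")).hexdigest()

    def get(self, text):
        key = self._key(text)
        row = self.db.execute(
            "SELECT vector FROM embeddings WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])      # cache hit
        vector = self.embed_fn(text)       # cache miss: compute and store
        self.db.execute(
            "INSERT INTO embeddings (key, vector) VALUES (?, ?)", (key, json.dumps(vector))
        )
        self.db.commit()
        return vector

calls = []

def fake_embed(text):
    # Stand-in for a real embedding call such as get_embedding(text)
    calls.append(text)
    return [float(len(text)), 0.0]

cache = EmbeddingCache(fake_embed)
first = cache.get("hello")
second = cache.get("hello")
print(first == second, len(calls))
```

Passing a file path instead of `":memory:"` keeps the cache across process restarts.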

Production Considerations

Scaling

python
# Pinecone: Scale index
# Increase replica count for higher throughput
# Use larger pod types for more storage

# Qdrant: Sharding
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    shard_number=4  # Number of shards
)

# Weaviate: Multi-node setup
# Configure replication factor

Monitoring

python
# Pinecone: Monitor index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Dimension: {stats['dimension']}")

# Qdrant: Monitor collection info
info = client.get_collection("documents")
print(f"Indexed vectors count: {info.indexed_vectors_count}")  # vectors_count is deprecated
print(f"Points count: {info.points_count}")

# Weaviate: Monitor cluster
cluster_status = client.cluster.get_nodes_status()  # v3 Python client
print(cluster_status)
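
The stats calls above report index size, not query speed. A minimal way to track query latency on the client side is to time each call and summarise the samples as percentiles. `LatencyTracker` is a hypothetical helper; the `sum(range(1000))` call only stands in for a real query such as `index.query(...)`.

```python
import statistics
import time

class LatencyTracker:
    """Collects call durations in milliseconds and reports percentiles."""

    def __init__(self):
        self.samples_ms = []

    def timed(self, fn, *args, **kwargs):
        """Call fn, record how long it took, and return its result."""
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            self.samples_ms.append((time.perf_counter() - start) * 1000)

    def summary(self):
        if len(self.samples_ms) < 2:
            raise ValueError("need at least two samples")
        cuts = statistics.quantiles(self.samples_ms, n=100)  # 99 cut points
        return {
            "count": len(self.samples_ms),
            "p50": statistics.median(self.samples_ms),
            "p95": cuts[94],
            "max": max(self.samples_ms),
        }

tracker = LatencyTracker()
for _ in range(20):
    tracker.timed(sum, range(1000))  # stand-in for a real query call
print(tracker.summary())
```

Tail percentiles such as p95 usually reveal slow filters or cold caches earlier than the average does.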

Backup and Recovery

python
# Pinecone: Back up an index
# Serverless indexes support backups; pod-based indexes use collections (static copies of an index)

# Qdrant: Snapshot
client.create_snapshot(collection_name="documents")

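# Weaviate: Backup
# Requires a backup module (for example backup-filesystem or backup-s3) enabled on the server.
# With the v3 Python client, backups are started via client.backup.create(...).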

Cost Optimization

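Choosing the Right Service

  • Pinecone: Managed service; you pay the provider instead of operating servers
  • Qdrant: Open-source and self-hostable; no license fee, but you run and pay for the infrastructure (a managed cloud is also available)
  • Weaviate: Open-source and self-hostable with a GraphQL API; vectorizer modules that call external APIs add embedding costs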

Storage Optimization

python
# Use smaller embedding models
# text-embedding-3-small (1536 dims) vs text-embedding-3-large (3072 dims)

# Compress vectors
# Use quantization or dimensionality reduction

# Delete old data
# Implement retention policies
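
To show what "quantization" means here, the sketch below applies simple symmetric int8 scalar quantization: each float is mapped to an integer in -127..127 using one scale per vector, cutting storage from 4 bytes (float32) to 1 byte per dimension at the cost of a small rounding error. This is an illustrative scheme with hypothetical function names; the databases above ship their own quantization options, which may work differently.

```python
def quantize_int8(vector):
    """Map floats to integers in -127..127 using one scale factor per vector."""
    max_abs = max(abs(x) for x in vector)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(x / scale) for x in vector], scale

def dequantize_int8(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [code * scale for code in codes]

vector = [0.12, -0.98, 0.45, 0.0031]
codes, scale = quantize_int8(vector)
restored = dequantize_int8(codes, scale)

max_error = max(abs(x - y) for x, y in zip(vector, restored))
print(codes)
print(max_error <= scale / 2 + 1e-12)  # rounding error is bounded by half a step
```

Whether the lost precision matters depends on the data, so it is worth checking search quality before and after enabling any compression.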

Query Optimization

python
# Use filters to reduce search space
# Limit top_k results
# Use appropriate distance metrics

Best Practices

  1. Choose Appropriate Embedding Model

    • For text: OpenAI text-embedding-3-small or text-embedding-ada-002
    • For images: CLIP, DINO, or domain-specific models
    • For multimodal: CLIP or similar models
  2. Preprocess Data

    • Clean text by removing special characters
    • Normalize whitespace
    • Convert to lowercase for consistency
  3. Use Appropriate Chunking

    • Chunk long documents
    • Use semantic chunking
    • Maintain context between chunks
  4. Implement Caching

    • Cache embeddings to reduce API calls
    • Cache query results
    • Use Redis for caching
  5. Monitor Performance

    • Track query latency
    • Monitor storage usage
    • Set up alerts for anomalies
  6. Use Filters Effectively

    • Use metadata filters to reduce search space
    • Combine vector search with keyword search
    • Use hybrid search when appropriate
  7. Handle Errors Gracefully

    • Implement retry logic
    • Handle rate limits
    • Log errors for debugging
  8. Test Thoroughly

    • Test with real data
    • Evaluate search quality
    • Benchmark performance
  9. Security

    • Use authentication in production
    • Encrypt sensitive data
    • Follow principle of least privilege
  10. Scalability

    • Design for horizontal scaling
    • Use appropriate sharding strategies
    • Monitor resource usage
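
The retry and rate-limit advice in practice 7 can be sketched as exponential backoff with jitter. `with_retries` and its parameters are hypothetical names for illustration; in real code `retry_on` should list the specific transient exceptions your client library raises, which this sketch does not assume.

```python
import random
import time

def with_retries(fn, max_attempts=5, base_delay=0.5, max_delay=8.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a listed exception wait and retry, doubling the delay cap each time."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))  # jitter avoids synchronized retries

attempts = {"count": 0}

def flaky_query():
    # Stand-in for a database call that fails twice before succeeding
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky_query, retry_on=(ConnectionError,), sleep=lambda seconds: None)
print(result, attempts["count"])
```

The `sleep` parameter is injectable so the demo and tests run instantly; production code would leave the default.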

Related Skills

  • 06-ai-ml-production/embedding-models
  • 06-ai-ml-production/rag-implementation
  • 06-ai-ml-production/vector-search
  • 04-database/database-optimization
  • 07-document-processing/rag-chunking-metadata-strategy
