Agent skill

Cassandra

Use Cassandra as a distributed NoSQL database that offers high availability, linear scalability, and eventual consistency.


Install this agent skill in your project

npx add-skill https://github.com/hivellm/rulebook/tree/main/templates/skills/services/cassandra

SKILL.md

Apache Cassandra Database Instructions

CRITICAL: Use Cassandra as a distributed NoSQL database that offers high availability, linear scalability, and eventual consistency.

Core Features

Connection

typescript
// Using cassandra-driver
import { Client, types } from 'cassandra-driver'

const client = new Client({
  // Comma-separated host list, e.g. "node1,node2,node3"
  contactPoints: (process.env.CASSANDRA_HOSTS || 'localhost').split(','),
  localDataCenter: process.env.CASSANDRA_DATACENTER || 'datacenter1',
  keyspace: process.env.CASSANDRA_KEYSPACE || 'myapp',
  credentials: {
    username: process.env.CASSANDRA_USER || 'cassandra',
    password: process.env.CASSANDRA_PASSWORD || 'cassandra',
  },
  queryOptions: {
    // Default consistency level; other common choices are quorum and all
    consistency: types.consistencies.one,
    prepare: true,
  },
})
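
The plain `split(',')` above keeps surrounding whitespace and empty entries, so a value such as `node1, node2,` would yield unusable contact points. A minimal sketch of a stricter parser follows; `parseContactPoints` is a hypothetical helper introduced for illustration, not part of cassandra-driver.

```typescript
// Hypothetical helper (not part of cassandra-driver): turn a comma-separated
// host list into a clean array suitable for the `contactPoints` option.
function parseContactPoints(raw: string | undefined, fallback = 'localhost'): string[] {
  const hosts = (raw ?? '')
    .split(',')
    .map((host) => host.trim())
    .filter((host) => host.length > 0)
  // Fall back to a single local node when nothing usable was provided.
  return hosts.length > 0 ? hosts : [fallback]
}

// Usage: contactPoints: parseContactPoints(process.env.CASSANDRA_HOSTS)
```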

Basic Operations

typescript
// Create keyspace (SimpleStrategy with replication_factor 1 suits a single-node dev setup only)
await client.execute(`
  CREATE KEYSPACE IF NOT EXISTS myapp
  WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 1
  }
`)

// Use keyspace
await client.execute('USE myapp')

// Create table
await client.execute(`
  CREATE TABLE IF NOT EXISTS users (
    id UUID PRIMARY KEY,
    email TEXT,
    name TEXT,
    created_at TIMESTAMP
  )
`)

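// Create a secondary index (use sparingly; email is high-cardinality, so a
// lookup table such as users_by_email usually scales better)
await client.execute('CREATE INDEX IF NOT EXISTS ON users (email)')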

// Insert (types is exported by the driver: import { types } from 'cassandra-driver')
await client.execute(
  'INSERT INTO users (id, email, name, created_at) VALUES (?, ?, ?, ?)',
  [types.Uuid.random(), 'john@example.com', 'John Doe', new Date()],
  { prepare: true }
)

// Select
const result = await client.execute(
  'SELECT * FROM users WHERE id = ?',
  [userId],
  { prepare: true }
)
const user = result.first()

// Update
await client.execute(
  'UPDATE users SET name = ? WHERE id = ?',
  ['Jane Doe', userId],
  { prepare: true }
)

// Delete
await client.execute(
  'DELETE FROM users WHERE id = ?',
  [userId],
  { prepare: true }
)
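
`result.first()` returns `null` when no row matches, so callers usually need a null check and a mapping step. The sketch below wraps the select shown above in a small function. `findUserById`, `QueryExecutor`, `RowLike`, and `User` are hypothetical names; the structural `QueryExecutor` type only exists so the example compiles without the driver, and a cassandra-driver `Client` is assumed to satisfy it.

```typescript
// Minimal structural types so the sketch compiles without the driver installed.
// Assumption: a cassandra-driver `Client` instance satisfies `QueryExecutor`.
interface RowLike { [column: string]: unknown }
interface QueryExecutor {
  execute(
    query: string,
    params: unknown[],
    options: { prepare: boolean }
  ): Promise<{ first(): RowLike | null }>
}

interface User { id: string; email: string; name: string }

// Hypothetical helper: fetch one user by primary key and map the row to a plain object.
async function findUserById(db: QueryExecutor, userId: string): Promise<User | null> {
  const result = await db.execute(
    'SELECT id, email, name FROM users WHERE id = ?',
    [userId],
    { prepare: true }
  )
  const row = result.first()
  if (row === null) return null // no row matched the primary key
  return {
    id: String(row['id']),
    email: String(row['email']),
    name: String(row['name']),
  }
}
```

Naming the columns explicitly also avoids `SELECT *`, which keeps the result shape stable when the table gains columns.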

Advanced Features

typescript
// Batch operations
const queries = [
  {
    query: 'INSERT INTO users (id, email, name) VALUES (?, ?, ?)',
    params: [id1, 'user1@example.com', 'User 1'],
  },
  {
    query: 'INSERT INTO users (id, email, name) VALUES (?, ?, ?)',
    params: [id2, 'user2@example.com', 'User 2'],
  },
]

await client.batch(queries, { prepare: true })

// Collections
await client.execute(`
  CREATE TABLE IF NOT EXISTS products (
    id UUID PRIMARY KEY,
    name TEXT,
    tags SET<TEXT>,
    metadata MAP<TEXT, TEXT>
  )
`)

await client.execute(
  'UPDATE products SET tags = tags + ? WHERE id = ?',
  [['electronics', 'gadgets'], productId],
  { prepare: true }
)

// Time-to-Live (TTL): the row expires after 3600 seconds (assumes a sessions table exists)
await client.execute(
  'INSERT INTO sessions (id, data) VALUES (?, ?) USING TTL 3600',
  [sessionId, sessionData],
  { prepare: true }
)
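
Batches that span many partitions put extra load on the coordinator node, so a common precaution is to group statements by partition key and send one batch per partition. A minimal sketch of that grouping step follows; `groupByPartition` and `BatchQuery` are hypothetical names, and the caller decides how the partition key is derived from a statement.

```typescript
interface BatchQuery { query: string; params: unknown[] }

// Hypothetical helper: bucket statements by a caller-supplied partition key so
// that each resulting batch touches a single partition.
function groupByPartition<T extends BatchQuery>(
  queries: T[],
  partitionKeyOf: (q: T) => string
): Map<string, T[]> {
  const groups = new Map<string, T[]>()
  for (const q of queries) {
    const key = partitionKeyOf(q)
    const bucket = groups.get(key)
    if (bucket) bucket.push(q)
    else groups.set(key, [q])
  }
  return groups
}

// Usage (assumes the first bound parameter is the partition key):
// for (const batch of groupByPartition(queries, (q) => String(q.params[0])).values()) {
//   await client.batch(batch, { prepare: true })
// }
```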

Common Patterns

Data Modeling

typescript
// Design tables for query patterns
// Query: Get users by email
await client.execute(`
  CREATE TABLE IF NOT EXISTS users_by_email (
    email TEXT PRIMARY KEY,
    id UUID,
    name TEXT,
    created_at TIMESTAMP
  )
`)

// Query: Get posts by user and date
await client.execute(`
  CREATE TABLE IF NOT EXISTS posts_by_user (
    user_id UUID,
    created_at TIMESTAMP,
    post_id UUID,
    title TEXT,
    content TEXT,
    PRIMARY KEY (user_id, created_at, post_id)
  ) WITH CLUSTERING ORDER BY (created_at DESC)
`)
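
Because each query pattern gets its own table, one logical write fans out to several tables. The sketch below builds the inserts for the `users` and `users_by_email` tables so they can be sent together through `client.batch`. `buildUserWrites`, `NewUser`, and `Statement` are hypothetical names, and the id is typed as a string purely for illustration (with the driver it would normally be a `types.Uuid`).

```typescript
interface NewUser { id: string; email: string; name: string; createdAt: Date }
interface Statement { query: string; params: unknown[] }

// Hypothetical helper: produce one insert per denormalized table for the same user.
function buildUserWrites(user: NewUser): Statement[] {
  return [
    {
      query: 'INSERT INTO users (id, email, name, created_at) VALUES (?, ?, ?, ?)',
      params: [user.id, user.email, user.name, user.createdAt],
    },
    {
      query: 'INSERT INTO users_by_email (email, id, name, created_at) VALUES (?, ?, ?, ?)',
      params: [user.email, user.id, user.name, user.createdAt],
    },
  ]
}

// Usage: await client.batch(buildUserWrites(user), { prepare: true })
```

Keeping the statement list in one place makes it harder to update one table and forget its lookup twin.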

Consistency Levels

typescript
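// Use the driver's named constants; the raw numbers 2 and 3 mean TWO and THREE,
// not QUORUM and ALL
import { types } from 'cassandra-driver'

// Read with QUORUM consistency
const result = await client.execute(
  'SELECT * FROM users WHERE id = ?',
  [userId],
  {
    consistency: types.consistencies.quorum,
    prepare: true,
  }
)

// Write with ALL consistency (strongest; fails if any replica is unavailable)
await client.execute(
  'INSERT INTO users (id, email, name) VALUES (?, ?, ?)',
  [id, email, name],
  {
    consistency: types.consistencies.all,
    prepare: true,
  }
)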
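
Choosing a level is easier with the arithmetic written out: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts together exceed the replication factor. A small sketch, with `quorum` and `readsSeeLatestWrite` as hypothetical helper names:

```typescript
// Number of replicas that must respond for a QUORUM read or write.
function quorum(replicationFactor: number): number {
  return Math.floor(replicationFactor / 2) + 1
}

// A read overlaps the latest acknowledged write when R + W > RF.
function readsSeeLatestWrite(readReplicas: number, writeReplicas: number, rf: number): boolean {
  return readReplicas + writeReplicas > rf
}

const rf = 3
console.log(quorum(rf)) // 2
console.log(readsSeeLatestWrite(quorum(rf), quorum(rf), rf)) // true (2 + 2 > 3)
console.log(readsSeeLatestWrite(1, 1, rf)) // false (ONE/ONE can read stale data)
```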

Best Practices

DO:

  • Design tables for query patterns (denormalize)
  • Use appropriate partition keys
  • Use clustering keys for sorting
  • Create secondary indexes sparingly
  • Use prepared statements
  • Set appropriate consistency levels
  • Use TTL for time-based data
  • Monitor cluster health
  • Use batch operations carefully
  • Implement retry logic
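
For the retry item, a minimal application-level sketch with exponential backoff is shown here. `withRetry` is a hypothetical helper, not a driver API; cassandra-driver also ships its own retry policies, which may be the better fit for driver-level errors.

```typescript
// Hypothetical helper: retry an async operation with exponential backoff.
// Only retry operations that are idempotent (safe to apply more than once).
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation()
    } catch (err) {
      lastError = err
      if (attempt === maxAttempts) break
      // With the defaults, wait 100 ms, then 200 ms, before the next attempt.
      const delay = baseDelayMs * 2 ** (attempt - 1)
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
  throw lastError
}

// Usage:
// const result = await withRetry(() =>
//   client.execute('SELECT id FROM users WHERE id = ?', [userId], { prepare: true })
// )
```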

DON'T:

  • Use secondary indexes on high-cardinality columns
  • Create too many secondary indexes
  • Use ALL consistency for all operations
  • Store large values (> 1MB)
  • Skip error handling
  • Ignore cluster topology
  • Hardcode contact points
  • Use SELECT * in production
  • Ignore data modeling best practices
  • Skip monitoring

Configuration

Environment Variables

bash
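# Single node (local development)
CASSANDRA_HOSTS=localhost
# Multiple nodes (comma-separated); use instead of the line above
# CASSANDRA_HOSTS=node1:9042,node2:9042,node3:9042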
CASSANDRA_DATACENTER=datacenter1
CASSANDRA_KEYSPACE=myapp
CASSANDRA_USER=cassandra
CASSANDRA_PASSWORD=securepassword
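
One way to consume these variables is to fail fast when any of them is missing, rather than silently falling back to default credentials. The sketch below assumes all five variables are required; `loadCassandraConfig` and `CassandraConfig` are hypothetical names.

```typescript
interface CassandraConfig {
  contactPoints: string[]
  localDataCenter: string
  keyspace: string
  username: string
  password: string
}

// Hypothetical helper: read the variables listed above and throw when one is
// missing or empty.
function loadCassandraConfig(env: Record<string, string | undefined>): CassandraConfig {
  const required = (name: string): string => {
    const value = env[name]
    if (!value) throw new Error(`Missing environment variable: ${name}`)
    return value
  }
  return {
    contactPoints: required('CASSANDRA_HOSTS').split(','),
    localDataCenter: required('CASSANDRA_DATACENTER'),
    keyspace: required('CASSANDRA_KEYSPACE'),
    username: required('CASSANDRA_USER'),
    password: required('CASSANDRA_PASSWORD'),
  }
}

// Usage: const config = loadCassandraConfig(process.env)
```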

Docker Compose

yaml
services:
  cassandra:
    image: cassandra:4
    ports:
      - "9042:9042"
    environment:
      CASSANDRA_CLUSTER_NAME: my-cluster
      CASSANDRA_DC: datacenter1
      CASSANDRA_RACK: rack1
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
    volumes:
      - cassandra_data:/var/lib/cassandra
    healthcheck:
      test: ["CMD-SHELL", "nodetool status | grep -E '^UN' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5

volumes:
  cassandra_data:
