Agent skill

indexing-strategy

Decide when and how to index Solana data vs direct RPC reads. Covers event design, backfill, storage, and performance. Use for data architecture decisions.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/indexing-strategy

SKILL.md

Indexing Strategy

Role framing: You are a data architect. Your goal is to choose an indexing approach that meets freshness and cost needs without overbuilding.

Initial Assessment

  • What data is needed (events, account states, historical candles)?
  • Freshness and latency requirements?
  • Query patterns (by owner, by mint, by time)?
  • Expected scale and retention?

Core Principles

  • Index only when RPC queries become too heavy or slow; start simple.
  • Emit structured events to simplify indexing; include versioning.
  • Backfill first, then stream; ensure idempotency.
  • Storage schema matches query needs; avoid over-normalizing hot paths.
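The idempotency principle can be sketched as a write path that is safe to replay, so a backfill and a live stream may overlap without double-counting. This is a minimal sketch using SQLite; the table name, the columns, and the choice of (signature, event_index) as the key are assumptions for illustration, not part of the skill.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the events table; (signature, event_index) is the idempotency key."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            signature   TEXT    NOT NULL,  -- transaction signature
            event_index INTEGER NOT NULL,  -- position of the event inside the tx (assumed)
            slot        INTEGER NOT NULL,  -- stored for ordering, not part of the key
            payload     TEXT    NOT NULL,
            PRIMARY KEY (signature, event_index)
        )
        """
    )
    return conn


def record_event(conn, signature, event_index, slot, payload):
    """Insert an event once; return True if it was new, False if already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO events (signature, event_index, slot, payload) "
        "VALUES (?, ?, ?, ?)",
        (signature, event_index, slot, payload),
    )
    conn.commit()
    return cur.rowcount == 1
```

Replaying the same event is a no-op, so the backfill and live workers do not need to coordinate on exactly where one stops and the other starts.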

Workflow

  1. Decide necessity
    • Try getProgramAccounts (with filters) + caches first; move to an indexer if responses are slow or result sets are large.
  2. Event design
    • Add program logs/events with discriminators and key fields; avoid verbose logs.
  3. Choose stack
    • Options: custom listener + DB, Helius/webhooks to queue, GraphQL subgraph equivalents, or hosted indexers.
  4. Backfill
    • Use getSignaturesForAddress (results are newest-first; paginate with before) + getTransaction, or start from a snapshot; store the cursor; verify counts.
  5. Live ingestion
    • Subscribe to logs (e.g. logsSubscribe) or webhooks; dedupe and order by slot + tx index.
  6. Query API
    • Expose REST/GraphQL tailored to frontend/bot needs; add caching.
  7. Monitoring
    • Lag metrics (slots behind), error rate, queue depth; alerts.
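The backfill step can be sketched as a paginated walk over getSignaturesForAddress, which returns records newest-first and accepts `limit` and `before` options. The `rpc` callable below is a hypothetical transport hook (it takes a method name and params and returns the JSON-RPC `result`), so the sketch does not assume any particular client library.

```python
from typing import Callable, Iterator, Optional

# Hypothetical hook: rpc(method, params) -> the JSON-RPC "result" value.
RpcCall = Callable[[str, list], list]


def iter_signatures(
    rpc: RpcCall,
    address: str,
    page_size: int = 1000,
    resume_before: Optional[str] = None,
) -> Iterator[dict]:
    """Yield signature records for `address`, newest first, one page at a time."""
    before = resume_before
    while True:
        options = {"limit": page_size}
        if before is not None:
            options["before"] = before
        page = rpc("getSignaturesForAddress", [address, options])
        if not page:
            return  # walked past the oldest signature
        for record in page:
            yield record
        # Cursor for the next page; persisting it lets a crashed backfill resume.
        before = page[-1]["signature"]
```

Because pages arrive newest-first, a worker that needs chronological order would collect a range and process it in reverse. Passing the last persisted cursor as `resume_before` continues an interrupted run.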

Templates / Playbooks

  • Event schema: event_name, version, keys..., values... with borsh or base64 payloads.
  • Backfill checkpoint table: slot, signature, processed flag.
  • Storage patterns: wide tables for hot paths; partition by day for history.
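A versioned event schema might be decoded as follows. This sketch assumes events arrive as `Program data: <base64>` log lines and uses a hypothetical fixed layout (8-byte discriminator, u8 version, 32-byte key, little-endian u64 amount); a real program's layout and field names will differ, and a borsh library would replace the `struct` call for variable-length fields.

```python
import base64
import struct
from dataclasses import dataclass
from typing import Optional

LOG_PREFIX = "Program data: "
# Hypothetical layout: discriminator (8 bytes), version (u8), owner (32 bytes), amount (u64 LE).
LAYOUT = struct.Struct("<8sB32sQ")
SUPPORTED_VERSION = 1


@dataclass(frozen=True)
class TransferEvent:
    discriminator: bytes
    version: int
    owner: bytes
    amount: int


def parse_log_line(line: str) -> Optional[TransferEvent]:
    """Decode one log line; return None for non-event lines or unknown versions."""
    if not line.startswith(LOG_PREFIX):
        return None
    raw = base64.b64decode(line[len(LOG_PREFIX):])
    if len(raw) < LAYOUT.size:
        return None
    discriminator, version, owner, amount = LAYOUT.unpack_from(raw)
    if version != SUPPORTED_VERSION:
        return None  # route to a different decoder once a new version ships
    return TransferEvent(discriminator, version, owner, amount)
```

Keeping the version byte next to the discriminator lets the indexer reject or reroute payloads it does not understand instead of misreading them.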

Common Failure Modes + Debugging

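  • Missing key fields in events -> queries become hard; add DB indexes where the data exists, or emit a new event version.
  • Backfill gaps from rate limits; implement retries with backoff and resume from stored cursors.
  • Duplicate processing on reorgs; key idempotency on the signature (plus an event index), since a transaction can land in a different slot on another fork; prefer confirmed or finalized commitment.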
  • Unbounded storage growth; set retention or cold storage.
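For the rate-limit failure mode, one common approach is exponential backoff with jitter around each RPC call. The sketch below is generic: `RateLimited` is a hypothetical exception that the caller's own RPC wrapper would raise (for example on HTTP 429), and the delay values are illustrative defaults.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's RPC wrapper when throttled."""


def with_retries(call, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Run `call()`; on RateLimited, wait with exponential backoff and try again."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error so the cursor is not advanced
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from parallel workers.
            sleep(delay * random.uniform(0.5, 1.0))
```

Letting the final failure propagate matters for backfills: the worker stops at the last stored cursor instead of silently skipping a page.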

Quality Bar / Validation

  • Clear rationale for indexing vs RPC; event design documented.
  • Backfill completed with verification counts; lag monitored.
  • APIs tested against target queries with latency targets met.

Output Format

Provide indexing decision, event schema, ingestion plan (backfill + live), storage/query design, and monitoring plan.

Examples

  • Simple: Small app uses RPC + caching; no indexer needed; document reasons.
  • Complex: High-volume protocol emits events; uses webhooks to queue -> worker -> Postgres; backfill from slot X; exposes GraphQL; monitors lag < 5 slots.
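The lag target in the complex example could be checked with a small helper like the one below. It assumes the chain tip comes from a `getSlot` call and that the indexer records the slot of the last event it committed; the function names and the returned dictionary shape are illustrative.

```python
def slots_behind(chain_tip_slot: int, last_indexed_slot: int) -> int:
    """Number of slots the indexer trails the chain tip (never negative)."""
    return max(0, chain_tip_slot - last_indexed_slot)


def lag_status(chain_tip_slot: int, last_indexed_slot: int, max_lag_slots: int = 5) -> dict:
    """Report lag and whether it is under the target (lag < max_lag_slots)."""
    lag = slots_behind(chain_tip_slot, last_indexed_slot)
    return {"lag_slots": lag, "healthy": lag < max_lag_slots}
```

Emitting `lag_slots` as a metric and alerting when `healthy` stays false for several consecutive checks avoids paging on a single slow poll.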
