Agent skill

schema-design

Universal database schema design principles: normalization strategy, data types, primary keys, constraints, anti-patterns, and red flags. Apply when designing schemas, reviewing database architecture, or refactoring data models.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/schema-design

SKILL.md

Database Schema Design Principles

"Schema design debt compounds faster than code debt. Fix it now or pay 10x later."

Foundation Principles

Primary Keys

  • Every table MUST have a primary key
  • Prefer surrogate keys (auto-increment, UUIDv7) over composite natural keys
  • UUIDv7 for distributed systems, auto-increment BIGINT for monoliths

Foreign Keys

  • Use FK constraints unless specific reason not to (high-volume logging, sharded DBs)
  • ON DELETE: RESTRICT (safest), CASCADE (use sparingly), SET NULL (breaks audit)

Data Types

  • Choose smallest sufficient type (BIGINT vs INT = 4 bytes * rows)
  • Money: DECIMAL (never FLOAT/DOUBLE)
  • Dates without time: DATE not DATETIME
  • Small sets: ENUM not VARCHAR

Constraints

  • NOT NULL on required columns
  • UNIQUE on natural keys
  • CHECK for business rules
  • DEFAULT where appropriate

Quality Checklist

Structural Integrity

  • Every table has primary key
  • Foreign key constraints defined
  • Appropriate data types (smallest sufficient)
  • NOT NULL, UNIQUE, CHECK constraints

Anti-Pattern Scan

  • No EAV (entity-attribute-value) patterns
  • No god tables (> 50 columns)
  • No multi-valued fields (CSV in columns)
  • No DATETIME for date-only data

Performance

  • Indexes match query patterns
  • Foreign keys indexed
  • Composite index column order optimized

Decision Trees

"Should I denormalize this?"

Have evidence of query performance problem?
├─ NO → DON'T denormalize (premature optimization)
└─ YES → Tried indexes, query optimization, caching?
   ├─ NO → Try those first
   └─ YES → Read-heavy (> 100:1)?
      ├─ NO → Normalize, optimize queries
      └─ YES → Denormalize specific fields

"UUID or auto-increment?"

Distributed system (multiple write nodes)?
├─ YES → UUIDv7 (time-ordered, better than v4)
└─ NO → Exposed to users (issue-123)?
   ├─ YES → Auto-increment (better UX)
   └─ NO → Auto-increment (better performance)

"Soft or hard delete?"

GDPR "right to erasure" applies?
├─ YES → Hard delete or audit table
└─ NO → Need audit trail?
   ├─ YES → Audit table pattern (recommended)
   └─ NO → High deletion rate (> 20%)?
      ├─ YES → Hard delete
      └─ NO → Soft delete acceptable

References

Detailed patterns and examples:

  • references/anti-patterns.md — EAV, god tables, multi-valued fields, red flags
  • references/normalization.md — 1NF/2NF/3NF, when to denormalize, OLTP vs OLAP
  • references/advanced-patterns.md — Soft delete, temporal data, JSON columns
  • references/naming-conventions.md — Tables, columns, indexes, constraints
  • references/performance-patterns.md — Indexing strategy, partitioning, data types

Remember

"The best schema is one you can understand in 6 months and modify with confidence."

Design schemas that:

  1. Enforce integrity — Constraints, foreign keys, data types
  2. Optimize for common patterns — Indexes, denormalization where proven
  3. Enable evolution — Proper normalization, migration strategy
  4. Prevent known anti-patterns — No EAV, god tables, multi-valued fields

Expand your agent's capabilities with these related and highly-rated skills.

Didn't find tool you were looking for?

Be as detailed as possible for better results