Agent skill

guardian

Proactively audit Cloudflare configurations for security vulnerabilities, resilience gaps, cost traps, AND budget/privacy risks. Use this skill when reviewing wrangler configs, before deployments, investigating issues, or when ANY architecture decision involves Durable Objects, R2, Workers AI, or high-volume operations. This skill PROACTIVELY warns about cost impacts before users ask.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/guardian

SKILL.md

Cloudflare Guardian Skill

Audit wrangler configurations for security vulnerabilities, performance issues, cost traps, resilience gaps, and proactively enforce budget/privacy constraints. Acts as a senior SRE and FinOps engineer reviewing infrastructure-as-code.

Cost Watchlist Reference

IMPORTANT: For detailed cost trap documentation, reference ${CLAUDE_PLUGIN_ROOT}/COST_SENSITIVE_RESOURCES.md.

When issuing cost warnings, use provenance tags:

  • [STATIC:COST_WATCHLIST] - Pattern detected via code analysis
  • [LIVE-VALIDATED:COST_WATCHLIST] - Confirmed by observability data
  • [REFUTED:COST_WATCHLIST] - Pattern exists but not hitting thresholds

Budget Whisperer Behavior

CRITICAL: When Claude suggests ANY code change involving the following, the guardian skill MUST trigger proactive checks:

D1 Write Operations

If suggesting code that includes .run(), .first(), or database writes:

  1. Search for .batch() - If missing, warn about per-row insert costs
  2. Search for CREATE INDEX - If querying unindexed columns, warn about scan costs
  3. Cite: TRAP-D1-001 or TRAP-D1-002 from COST_SENSITIVE_RESOURCES.md
Budget Whisperer Check:
- Detected: D1 write operation in proposed code
- Searched for: db.batch() usage
- Found: ❌ Missing batch operations
- Warning: [STATIC:COST_WATCHLIST] TRAP-D1-001
  Per-row INSERT detected. At 10K rows, this costs $0.01 vs $0.00001 batched.
  Recommendation: Wrap in db.batch() with max 1000 statements per batch.

D1 Read Operations (Row Read Explosion)

CRITICAL: D1's primary danger for solo devs is Row Reads, not writes. Unoptimized queries can trigger millions of reads per page view. One user hit the 5 million daily limit just by browsing their own site.

If suggesting code that includes SELECT, .all(), .first(), or database reads:

  1. Check for WHERE clause - SELECT without WHERE causes full table scans
  2. Verify indexed columns - WHERE on non-indexed column = full scan
  3. Check for LIMIT - Unbounded queries can read entire tables
  4. Consider caching - High-traffic reads should use KV cache first
  5. Cite: TRAP-D1-004 from COST_SENSITIVE_RESOURCES.md
Row Read Explosion Check:
- Detected: D1 SELECT query in proposed code
- Pattern: SELECT * FROM users WHERE status = 'active'
- Index Check: ❌ No index on 'status' column
- Warning: [STATIC:COST_WATCHLIST] TRAP-D1-004
  Unindexed query detected. With 100K rows, each request reads 100K rows.
  At 1K requests/day = 100M rows/day = $25/day (Free tier: 5B/month = 166M/day)
  Recommendation:
  1. CREATE INDEX idx_users_status ON users(status)
  2. Add LIMIT clause: LIMIT 100
  3. For high-traffic endpoints, use KV-Cache-First pattern

KV-Cache-First Pattern Detection: For high-traffic endpoints (>100 requests/minute), always suggest:

typescript
// EXPENSIVE: Direct D1 read on every request
const users = await db.prepare('SELECT * FROM users WHERE status = ?').bind('active').all();
// 1000 req/min × 100K rows = 6B rows/hour = $1.50/hour

// OPTIMIZED: KV cache with D1 fallback
const cacheKey = `users:active:${page}`;
let users = await env.KV.get(cacheKey, 'json');
if (!users) {
  users = await db.prepare('SELECT * FROM users WHERE status = ? LIMIT 100').bind('active').all();
  await env.KV.put(cacheKey, JSON.stringify(users), { expirationTtl: 60 });
}
// KV reads: $0.50/M, cache hit = no D1 cost

R2 Write Operations (Class A)

If suggesting code that includes .put():

  1. Check loop context - Is .put() inside a loop or frequently called handler?
  2. Search for buffering - Any aggregation before write?
  3. Cite: TRAP-R2-001 from COST_SENSITIVE_RESOURCES.md

R2 Read Operations (Class B)

If suggesting code that includes .get() for public assets:

  1. Check caching - Is response cached at the edge?
  2. Check Cache-Control headers - Are objects served with proper cache headers?
  3. Cite: TRAP-R2-003 from COST_SENSITIVE_RESOURCES.md
R2 Class B Cost Check:
- Detected: Public R2 bucket serving assets
- Pattern: await bucket.get(key) in request handler
- Caching: ❌ No Cache-Control headers, no CDN cache
- Warning: [STATIC:COST_WATCHLIST] TRAP-R2-003
  Every request = $0.36/M Class B operation.
  At 10M requests/month = $3.60/month (Free tier: 10M)
  Recommendation:
  1. Add Cache-Control: public, max-age=31536000 for static assets
  2. Use Cache Rules or Cache API to cache at edge
  3. See: r2-cdn-cache pattern

R2 Infrequent Access (IA) Storage Trap

CRITICAL WARNING: If code or config uses R2 Infrequent Access storage class:

R2 IA Billing Trap Check:
- Detected: R2 Infrequent Access storage usage
- Warning: [STATIC:COST_WATCHLIST] TRAP-R2-004 (CRITICAL)
  ⚠️ R2 IA has MINIMUM BILLING UNITS.
  A single operation can incur $9.00 minimum charge.

  Explanation: Cloudflare rounds up to next billing unit.
  1 file retrieval = rounds to minimum = $9.00

  Safe Usage:
  ✅ True cold storage (backups, archives)
  ✅ Large files (>100MB) where retrieval cost is amortized
  ✅ Objects never accessed after upload

  Dangerous Usage:
  ❌ Any bucket with regular read operations
  ❌ Small files that might be accessed
  ❌ User-facing asset storage

Durable Objects Usage

If suggesting DO architecture:

  1. Check use case - Is coordination/locking actually needed?
  2. Suggest alternatives - KV for simple storage, D1 for relational
  3. Cite: TRAP-DO-001 from COST_SENSITIVE_RESOURCES.md

Vibecoder Proactive Safeguards

IMPORTANT: This skill should proactively warn users about cost and privacy impacts BEFORE they deploy or even ask about costs. When reviewing ANY architecture that includes the following, immediately surface budget/privacy alerts:

Budget Enforcement Triggers

Service/Pattern Threshold Proactive Warning
Durable Objects Any usage "DO charges ~$0.15/GB-month storage + $0.50/M requests. Consider KV for simple key-value."
R2 Class A ops >1M/month "R2 writes cost $4.50/M. Buffer writes or use presigned URLs for client uploads."
R2 Class B ops >10M/month "R2 reads cost $0.36/M. For public buckets, add Cache Rules to serve from CDN (free)."
R2 Infrequent Access Any usage "⚠️ R2 IA has minimum billing units. A single operation may incur $9.00 minimum charge."
D1 Writes >10M/month "D1 writes cost $1/M. Detected pattern suggests >$10/mo. Batch to ≤1,000 rows."
D1 Reads (unindexed) Any unindexed query "⚠️ Unindexed D1 query causes full table scan. 100K rows × 1K req = 100M reads/day."
Workers AI (>8B) Any usage "Large models (Llama 11B+) cost $0.68/M tokens. Use 8B or smaller for bulk."
Vectorize >1M vectors "Approaching 5M vector limit. Plan sharding strategy."
KV Writes >5M/month "KV writes cost $5/M (10× reads). Consider D1 or R2 for write-heavy."

Privacy Enforcement Triggers

Pattern Severity Proactive Warning
PII in logs CRITICAL "Detected potential PII logging. Use structured logging with redaction."
User data in KV keys HIGH "KV keys with user IDs may leak via Workers dashboard. Hash or encrypt."
AI prompts with PII HIGH "AI Gateway logs may contain user data. Enable prompt redaction."
R2 public buckets HIGH "R2 bucket appears public. Verify intentional or add authentication."
Analytics with user IDs MEDIUM "User IDs in Analytics Engine may persist. Use anonymized identifiers."

Audit Categories

Security Audit Rules

ID Name Severity Check
SEC001 Secrets in plaintext CRITICAL vars.* contains API_KEY, SECRET, PASSWORD, TOKEN patterns
SEC002 Missing route auth HIGH Routes without cf.access or auth middleware
SEC003 CORS wildcard MEDIUM cors.origins includes *
SEC004 Exposed admin routes HIGH /admin/* routes without auth
SEC005 Missing rate limiting MEDIUM No rate limit bindings for public APIs
SEC006 Debug mode enabled LOW ENVIRONMENT or DEBUG set to development/true

Performance Audit Rules

ID Name Severity Check
PERF001 Missing Smart Placement LOW placement.mode not set
PERF002 D1 without indexes MEDIUM D1 bindings but no CREATE INDEX in migrations
PERF003 Large bundled dependencies MEDIUM Bundle >10MB (check main entry)
PERF004 Missing observability LOW No observability config block
PERF005 Frequent cron LOW Cron more often than every 5 minutes

Cost Audit Rules

ID Name Severity Check
COST001 Queue retries high MEDIUM max_retries > 1 for potentially idempotent consumers
COST002 No cron batching LOW Multiple crons that could be combined
COST003 AI without caching MEDIUM AI bindings but no AI Gateway
COST004 Large model usage LOW Workers AI with >8B parameter models
COST005 Missing Analytics Engine INFO Using D1/KV for metrics instead of free AE

Resilience Audit Rules

ID Name Severity Check
RES001 Missing DLQ HIGH Queues without dead_letter_queue binding
RES002 No concurrency limit MEDIUM max_concurrency not set for queue consumers
RES003 Single region LOW No cf.smart_placement for latency-sensitive
RES004 Missing retry config MEDIUM Queue consumer without explicit retry config
RES005 No circuit breaker LOW External API calls without timeout/fallback

Budget Audit Rules (Proactive)

ID Name Severity Check
BUDGET001 Durable Objects usage INFO Any DO binding - proactively explain cost model
BUDGET002 R2 write-heavy pattern MEDIUM Frequent R2 Class A ops without buffering
BUDGET003 D1 per-row inserts HIGH Loop-based INSERTs instead of batch
BUDGET004 Large AI model MEDIUM Workers AI with >8B parameter model
BUDGET005 KV write-heavy MEDIUM >5M KV writes/month pattern
BUDGET006 Vectorize scaling INFO >1M vectors - warn about 5M limit
BUDGET007 D1 row read explosion CRITICAL SELECT without index/LIMIT on high-traffic endpoint
BUDGET008 R2 Class B without cache MEDIUM Public R2 bucket reads without CDN caching
BUDGET009 R2 Infrequent Access trap HIGH R2 IA storage with minimum charge risk

Privacy Audit Rules

ID Name Severity Check
PRIV001 PII in logs CRITICAL console.log with user data patterns
PRIV002 User IDs in KV keys HIGH KV key patterns containing user/email/phone
PRIV003 AI prompts PII HIGH AI bindings without redaction middleware
PRIV004 R2 public access HIGH R2 bucket without authentication
PRIV005 Analytics PII MEDIUM User identifiers in Analytics Engine writes

Loop-Sensitive Resource Audit Rules (Billing Safety)

CRITICAL: Infinite loops are direct billing multipliers. These rules detect patterns that could cause runaway costs.

ID Name Severity Check
LOOP001 Missing cpu_ms limit HIGH No limits.cpu_ms in wrangler config
LOOP002 D1 query in loop CRITICAL SQL operations inside for/while/forEach blocks
LOOP003 R2 write in loop HIGH .put() calls inside iteration without buffering
LOOP004 DO setInterval HIGH setInterval in Durable Object without termination
LOOP005 Worker self-fetch CRITICAL fetch() to same Worker URL without depth limit
LOOP006 Queue no DLQ HIGH Queue consumer without dead_letter_queue (retry loop)
LOOP007 Unbounded while CRITICAL while(true) or for(;;) without break condition
LOOP008 High retry count MEDIUM Queue max_retries > 2 (cost multiplier)

Audit Workflow

Step 1: Parse Wrangler Config

Support both TOML and JSONC formats:

1. Read wrangler.toml or wrangler.jsonc
2. Parse into structured format
3. Extract: name, bindings, routes, triggers, vars

Step 2: Run Security Checks

For each security rule:
1. Check if pattern exists in config
2. If violation found:
   - Record rule ID, severity, location
   - Generate specific recommendation
   - Include docs URL if available

Step 3: Run Performance Checks

For each performance rule:
1. Check config for anti-patterns
2. Cross-reference with migrations (for D1 index checks)
3. Record findings with optimization recommendations

Step 4: Run Cost Checks

For each cost rule:
1. Identify cost-amplifying patterns
2. Estimate impact if possible
3. Provide specific fixes

Step 5: Run Resilience Checks

For each resilience rule:
1. Check for missing failure handling
2. Identify single points of failure
3. Recommend redundancy patterns

Step 5b: Run Budget Enforcement Checks (Proactive)

For bindings that trigger budget warnings:
1. Detect Durable Objects → Explain cost model proactively
2. Detect R2 writes → Check for buffering patterns
3. Detect D1 writes → Check for batch vs per-row
4. Detect Workers AI → Check model size selection
5. Detect high-volume KV → Suggest alternatives

Key principle: Surface budget impacts BEFORE the user asks about costs.

Step 5c: Run Privacy Checks

For privacy-sensitive patterns:
1. Scan code for console.log with user data patterns
2. Check KV key naming for PII patterns
3. Verify AI prompts have redaction middleware
4. Check R2 bucket access controls
5. Review Analytics Engine write patterns

Step 5d: Run Loop-Sensitive Resource Audit (Billing Safety)

CRITICAL: These checks prevent "denial-of-wallet" attacks from runaway loops.

For loop-sensitive patterns:

1. Check wrangler config for `limits.cpu_ms`:
   - Missing → LOOP001 (HIGH)
   - Suggest: Add limits.cpu_ms based on use case

2. Scan code for D1 queries in loops:
   Pattern: for|while|forEach.*\.prepare\(|\.run\(|\.first\(
   - Found in loop → LOOP002 (CRITICAL)
   - Cite: TRAP-D1-001 (N+1 query cost explosion)
   - Fix: Use db.batch() or QueryBatcher pattern

3. Scan code for R2 writes in loops:
   Pattern: for|while|forEach.*\.put\(
   - Found in loop → LOOP003 (HIGH)
   - Cite: TRAP-R2-001 (Class A op explosion)
   - Fix: Buffer writes, use batch upload

4. Scan Durable Objects for setInterval:
   Pattern: setInterval\(
   - Found without clearInterval or alarm() → LOOP004 (HIGH)
   - Duration billing continues while interval runs
   - Fix: Use state.storage.setAlarm() for hibernation

5. Scan for Worker self-fetch patterns:
   Pattern: fetch\(.*request\.url|fetch\(.*self
   - Missing X-Recursion-Depth handling → LOOP005 (CRITICAL)
   - Reference: @skills/loop-breaker/SKILL.md

6. Check queue config for DLQ:
   - Consumer without dead_letter_queue → LOOP006 (HIGH)
   - Each retry = additional message cost

7. Scan for unbounded loops:
   Pattern: while\s*\(\s*true\s*\)|for\s*\(\s*;\s*;\s*\)
   - No break/return condition visible → LOOP007 (CRITICAL)

8. Check queue max_retries:
   - max_retries > 2 → LOOP008 (MEDIUM)
   - Cost impact: (retries + 1) × message cost

Loop-Sensitive Resource Audit Output

When loop issues are found, include in the audit output:

markdown
## Loop-Sensitive Resource Audit (Billing Safety)

> **CRITICAL**: Detected patterns that could cause runaway billing.

| ID | Location | Issue | Est. Cost Impact |
|----|----------|-------|------------------|
| LOOP002 | src/handlers/import.ts:45 | D1 query in forEach loop | $0.01 per 1K iterations |
| LOOP005 | src/webhook.ts:23 | fetch(request.url) without depth limit | Unbounded |

### LOOP002: D1 Query in Loop (CRITICAL)

**Location**: `src/handlers/import.ts:45`

**Pattern Detected**:
```typescript
for (const item of items) {
  await db.prepare('INSERT INTO items ...').run();  // N queries
}

Cost Analysis:

  • 10,000 items = 10,000 D1 write operations
  • Cost: 10,000 × $1/M = $0.01 per batch
  • At scale: 1M items/day = $1/day = $30/month

Fix: [STATIC:COST_WATCHLIST] TRAP-D1-001

typescript
// Use db.batch() - 1 operation per 1000 items
const statements = items.map(item =>
  db.prepare('INSERT INTO items ...').bind(item.name)
);
await db.batch(statements);

LOOP005: Worker Self-Fetch (CRITICAL)

Location: src/webhook.ts:23

Pattern Detected:

typescript
const response = await fetch(request.url, { ... });

Risk: Infinite recursion if this webhook triggers itself.

Fix: Add recursion depth middleware

typescript
import { recursionGuard } from './middleware/recursion-guard';
app.use('*', recursionGuard({ maxDepth: 3 }));

See: @skills/loop-breaker/SKILL.md


### Step 6: Calculate Score

score = 100 - (critical × 25) - (high × 15) - (medium × 5) - (low × 2)


Grades:
- 90-100: A (Production ready)
- 80-89: B (Minor issues)
- 70-79: C (Address before deployment)
- 60-69: D (Significant issues)
- <60: F (Critical problems)

## Output Format

```markdown
# Cloudflare Configuration Audit

**Score**: XX/100 (Grade: X)
**File**: wrangler.jsonc

## Proactive Budget & Privacy Alerts

> **Budget Impact Detected**: [List any BUDGET* findings with cost estimates]
> **Privacy Concern**: [List any PRIV* findings requiring attention]

## Summary

| Category | Critical | High | Medium | Low | Info |
|----------|----------|------|--------|-----|------|
| Security | X | X | X | X | - |
| Performance | X | X | X | X | - |
| Cost | X | X | X | X | - |
| Resilience | X | X | X | X | - |
| Budget | - | X | X | - | X |
| Privacy | X | X | X | - | - |

## Critical Issues (Must Fix)

### SEC001: Secrets in plaintext
- **Location**: `vars.API_KEY`
- **Issue**: Plaintext API key in configuration
- **Fix**: Use `wrangler secret put API_KEY`
- **Docs**: https://developers.cloudflare.com/workers/configuration/secrets/

## High Priority Issues

### RES001: Missing dead letter queue
- **Location**: `queues[0]` (harvest-queue)
- **Issue**: No DLQ for failed message inspection
- **Fix**: Add `dead_letter_queue = "harvest-dlq"`

## Medium Priority Issues

[List all medium issues]

## Low Priority Issues

[List all low issues]

## Recommendations

1. [ ] Move secrets to wrangler secret
2. [ ] Add DLQ for all production queues
3. [ ] Enable Smart Placement
4. [ ] Consider Analytics Engine for metrics

Migration Checks

When D1 bindings exist, also scan migration files:

sql
-- Good: Has index
CREATE INDEX idx_projects_source ON projects(source);

-- Bad: Missing index for common query pattern
SELECT * FROM projects WHERE source = ? ORDER BY created_at DESC;

Flag missing indexes for:

  • Columns in WHERE clauses
  • Columns in ORDER BY
  • Compound queries (need compound indexes)

Wrangler Config Patterns

Good Patterns to Recognize

jsonc
{
  // Smart Placement enabled
  "placement": { "mode": "smart" },

  // Observability configured
  "observability": { "logs": { "enabled": true } },

  // Queue with DLQ
  "queues": {
    "consumers": [{
      "queue": "my-queue",
      "dead_letter_queue": "my-dlq",
      "max_retries": 1,
      "max_concurrency": 10
    }]
  }
}

Bad Patterns to Flag

jsonc
{
  // Secrets in vars
  "vars": { "API_KEY": "sk-xxxxx" },

  // No DLQ
  "queues": { "consumers": [{ "queue": "my-queue" }] },

  // High retries
  "queues": { "consumers": [{ "max_retries": 10 }] }
}

Live Validation with Probes

When MCP tools are available (via --validate mode in /cf-audit), enhance static findings with live data.

Reference @skills/probes/SKILL.md for detailed query patterns.

Security Validation

  • Error rate analysis: High errors on specific paths may indicate attacks
  • Request patterns: Verify authentication is actually enforced
  • Resource exposure: Check KV/R2 for public access settings

Performance Validation

  • EXPLAIN QUERY PLAN: Verify D1 index usage
  • Latency percentiles: P50/P95/P99 analysis
  • CPU time analysis: Identify hotspots

Resilience Validation

  • Queue health: Check DLQ depth and retry rates
  • Error patterns: Identify cascading failures

Provenance Tagging

Tag findings based on data source:

  • [STATIC] - Inferred from code/config analysis only
  • [LIVE-VALIDATED] - Confirmed by observability data
  • [LIVE-REFUTED] - Code smell not observed in production
  • [INCOMPLETE] - MCP tools unavailable for verification

Pattern Recommendations

When issues are found, recommend applicable patterns from @skills/patterns/:

Finding Recommended Pattern
Per-row D1 inserts d1-batching
External API issues circuit-breaker
Monolithic Worker service-bindings

Tips

  • Run before every deployment
  • Use --validate for production-ready verification
  • Focus on CRITICAL and HIGH first
  • Use --fix suggestions to auto-generate patches
  • Compare scores over time to track improvements
  • [LIVE-REFUTED] findings may still be worth fixing proactively

Expand your agent's capabilities with these related and highly-rated skills.

Didn't find tool you were looking for?

Be as detailed as possible for better results