Agent skill
clade-rate-limits
Handle Anthropic rate limits — understand tiers, implement backoff, Use when working with rate-limits patterns. optimize throughput, and monitor usage. Trigger with "anthropic rate limit", "claude 429", "anthropic throttling", "anthropic usage limits", "claude tokens per minute".
Install this agent skill to your Project
npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/claude-pack/skills/clade-rate-limits
SKILL.md
Anthropic Rate Limits
Overview
Anthropic enforces three types of limits: requests per minute (RPM), input tokens per minute (TPM), and output tokens per minute. Limits depend on your spend tier.
Rate Limit Tiers
| Tier | Qualification | RPM | Input TPM | Output TPM |
|---|---|---|---|---|
| Tier 1 | Free | 50 | 40,000 | 8,000 |
| Tier 2 | $40+ spend | 1,000 | 80,000 | 16,000 |
| Tier 3 | $200+ spend | 2,000 | 160,000 | 32,000 |
| Tier 4 | $400+ spend | 4,000 | 400,000 | 80,000 |
| Scale | Custom | Custom | Custom | Custom |
Check your tier: console.anthropic.com → Settings → Limits
Response Headers
Every API response includes rate limit headers:
claude-ratelimit-requests-limit: 1000
claude-ratelimit-requests-remaining: 998
claude-ratelimit-requests-reset: 2025-01-01T00:01:00Z
claude-ratelimit-tokens-limit: 80000
claude-ratelimit-tokens-remaining: 79500
claude-ratelimit-tokens-reset: 2025-01-01T00:01:00Z
retry-after: 5
Built-In SDK Retries
The SDK automatically retries 429 and 529 errors with exponential backoff:
import Anthropic from '@claude-ai/sdk';
const client = new Anthropic({
maxRetries: 3, // default: 2. Set to 0 to disable.
});
Custom Backoff
async function callWithBackoff(params: Anthropic.MessageCreateParams, maxRetries = 5) {
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await client.messages.create(params);
} catch (err) {
if (err instanceof Anthropic.RateLimitError) {
const retryAfter = Number(err.headers?.['retry-after'] || 2 ** attempt);
const jitter = Math.random() * 1000;
console.log(`Rate limited. Retry in ${retryAfter}s (attempt ${attempt + 1})`);
await new Promise(r => setTimeout(r, retryAfter * 1000 + jitter));
} else {
throw err;
}
}
}
throw new Error('Exceeded max retries');
}
Throughput Optimization
| Strategy | Impact |
|---|---|
| Use Message Batches API | Bypasses rate limits entirely (async, 24h SLA) |
| Use prompt caching | Cached tokens don't count toward input TPM |
| Use smaller models for simple tasks | Lower token counts = more requests per minute |
Pre-count tokens with countTokens |
Avoid wasted requests that will fail |
| Queue and batch requests | Smooth out bursts |
Token Counting
// Count before sending — avoid burning RPM on requests that'll fail
const count = await client.messages.countTokens({
model: 'claude-sonnet-4-20250514',
messages,
system: systemPrompt,
});
console.log(`This request will use ${count.input_tokens} input tokens`);
Python
import anthropic
import time
client = anthropic.Anthropic(max_retries=5)
# Or manual handling:
try:
message = client.messages.create(...)
except anthropic.RateLimitError as e:
retry_after = float(e.response.headers.get("retry-after", 5))
time.sleep(retry_after)
Output
- Rate limit tier identified from response headers
- SDK configured with appropriate
maxRetriessetting - Custom backoff implemented with jitter for high-throughput use cases
- Throughput optimized using batches, caching, or model selection
Error Handling
| Error | Cause | Solution |
|---|---|---|
| API Error | Check error type and status code | See clade-common-errors |
Examples
See Rate Limit Tiers table, Response Headers section, Built-In SDK Retries, Custom Backoff implementation, and Throughput Optimization strategies above.
Resources
- Rate Limits Docs
- Message Batches — no rate limits
- Token Counting
Next Steps
See clade-cost-tuning for cost optimization strategies.
Prerequisites
- Completed
clade-install-auth - Understanding of HTTP response headers
- Familiarity with exponential backoff patterns
Instructions
Step 1: Review the patterns below
Each section contains production-ready code examples. Copy and adapt them to your use case.
Step 2: Apply to your codebase
Integrate the patterns that match your requirements. Test each change individually.
Step 3: Verify
Run your test suite to confirm the integration works correctly.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
dockerfile-generator
Dockerfile Generator - Auto-activating skill for DevOps Basics. Triggers on: dockerfile generator, dockerfile generator Part of the DevOps Basics skill category.
branch-naming-helper
Branch Naming Helper - Auto-activating skill for DevOps Basics. Triggers on: branch naming helper, branch naming helper Part of the DevOps Basics skill category.
readme-generator
Readme Generator - Auto-activating skill for DevOps Basics. Triggers on: readme generator, readme generator Part of the DevOps Basics skill category.
makefile-generator
Makefile Generator - Auto-activating skill for DevOps Basics. Triggers on: makefile generator, makefile generator Part of the DevOps Basics skill category.
gitignore-generator
Gitignore Generator - Auto-activating skill for DevOps Basics. Triggers on: gitignore generator, gitignore generator Part of the DevOps Basics skill category.
pre-commit-hook-setup
Pre Commit Hook Setup - Auto-activating skill for DevOps Basics. Triggers on: pre commit hook setup, pre commit hook setup Part of the DevOps Basics skill category.
Didn't find tool you were looking for?