Deployment Procedures
Deployment principles and decision-making for safe production releases. Learn to THINK, not memorize scripts.
⚠️ How to Use This Skill
This skill teaches deployment principles, not bash scripts to copy.
- Every deployment is unique
- Understand the WHY behind each step
- Adapt procedures to your platform
1. Platform Selection
Decision Tree
What are you deploying?
│
├── Static site / JAMstack
│   └── Vercel, Netlify, Cloudflare Pages
│
├── Simple web app
│   ├── Managed → Railway, Render, Fly.io
│   └── Control → VPS + PM2/Docker
│
├── Microservices
│   └── Container orchestration
│
└── Serverless
    └── Edge functions, Lambda
Each Platform Has Different Procedures
| Platform | Deployment Method |
|---|---|
| Vercel/Netlify | Git push, auto-deploy |
| Railway/Render | Git push or CLI |
| VPS + PM2 | SSH + manual steps |
| Docker | Image push + orchestration |
| Kubernetes | kubectl apply |
2. Pre-Deployment Principles
The 4 Verification Categories
| Category | What to Check |
|---|---|
| Code Quality | Tests passing, linting clean, reviewed |
| Build | Production build works, no warnings |
| Environment | Env vars set, secrets current |
| Safety | Backup done, rollback plan ready |
Pre-Deployment Checklist
- All tests passing
- Code reviewed and approved
- Production build successful
- Environment variables verified
- Database migrations ready (if any)
- Rollback plan documented
- Team notified
- Monitoring ready
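The checklist above can act as a gate rather than a reminder. The following is a minimal sketch, not a script to copy: the `Check` type, the `unmet_checks` helper, and the sample values are all hypothetical, and in practice each result would come from your CI system, build tooling, or a human sign-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    passed: bool
    required: bool = True  # set False for items that do not apply to this release


def unmet_checks(checks: list[Check]) -> list[str]:
    """Return the names of required checks that have not passed."""
    return [c.name for c in checks if c.required and not c.passed]


# Hypothetical state of one release, for illustration only.
checklist = [
    Check("All tests passing", True),
    Check("Code reviewed and approved", True),
    Check("Production build successful", True),
    Check("Environment variables verified", True),
    Check("Database migrations ready", False, required=False),  # no migrations this time
    Check("Rollback plan documented", False),
    Check("Team notified", True),
    Check("Monitoring ready", True),
]

blockers = unmet_checks(checklist)
print("ready" if not blockers else f"blocked by: {blockers}")
```

Returning the list of blockers instead of a bare boolean keeps the WHY visible: the person deploying sees exactly which item is missing.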
3. Deployment Workflow Principles
The 5-Phase Process
1. PREPARE
   └── Verify code, build, env vars
2. BACKUP
   └── Save current state before changing
3. DEPLOY
   └── Execute with monitoring open
4. VERIFY
   └── Health check, logs, key flows
5. CONFIRM or ROLLBACK
   └── All good? Confirm. Issues? Roll back.
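The ordering of the five phases matters more than any particular command. As an illustration only, the sketch below wires the phases together with injected callables; `run_deployment` and its five parameters are hypothetical names, and what each callable actually does depends entirely on your platform.

```python
from typing import Any, Callable


def run_deployment(
    prepare: Callable[[], bool],
    backup: Callable[[], Any],
    deploy: Callable[[], None],
    verify: Callable[[], bool],
    rollback: Callable[[Any], None],
) -> str:
    """Run the five phases in order and report how the release ended."""
    # 1. PREPARE: stop before anything changes if verification fails.
    if not prepare():
        return "aborted"
    # 2. BACKUP: capture the current state so rollback has something to restore.
    snapshot = backup()
    # 3. DEPLOY and 4. VERIFY: an exception in either phase counts as a failed release.
    try:
        deploy()
        healthy = verify()
    except Exception:
        healthy = False
    # 5. CONFIRM or ROLLBACK.
    if healthy:
        return "confirmed"
    rollback(snapshot)
    return "rolled_back"
```

Note that the backup is taken before `deploy` runs and is passed to `rollback` explicitly, so a rollback can never be attempted without a saved state.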
Phase Principles
| Phase | Principle |
|---|---|
| Prepare | Never deploy untested code |
| Backup | Can't roll back without a backup |
| Deploy | Watch it happen, don't walk away |
| Verify | Trust but verify |
| Confirm | Have rollback trigger ready |
4. Post-Deployment Verification
What to Verify
| Check | Why |
|---|---|
| Health endpoint | Service is running |
| Error logs | No new errors |
| Key user flows | Critical features work |
| Performance | Response times acceptable |
Verification Window
- First 5 minutes: Active monitoring
- 15 minutes: Confirm stable
- 1 hour: Final verification
- Next day: Review metrics
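Active monitoring in the first minutes usually means probing the service repeatedly instead of checking once. The sketch below shows one way to express that, assuming a caller-supplied `probe` that returns `True` when the health endpoint responds correctly; `watch_health`, its parameters, and the tolerance for failures are hypothetical choices, not part of any platform's API.

```python
import time
from typing import Callable


def watch_health(
    probe: Callable[[], bool],
    checks: int,
    interval_s: float,
    max_failures: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe repeatedly; return False as soon as failures exceed max_failures."""
    failures = 0
    for i in range(checks):
        if not probe():
            failures += 1
            if failures > max_failures:
                return False  # stop early so a rollback decision is not delayed
        if i < checks - 1:
            sleep(interval_s)  # wait between probes, but not after the last one
    return True
```

For example, `watch_health(probe, checks=30, interval_s=10)` covers roughly the first five minutes. The `sleep` parameter exists so the loop can be exercised in tests without real waiting.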
5. Rollback Principles
When to Rollback
| Symptom | Action |
|---|---|
| Service down | Roll back immediately |
| Critical errors | Roll back |
| Performance >50% degraded | Consider rolling back |
| Minor issues | Fix forward if quick |
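The table above is a priority order: the most severe symptom present decides the action. A minimal sketch of that ordering follows. The `Action` enum and `rollback_decision` function are hypothetical, and reading "performance >50% degraded" as "latency more than 50% above baseline" is an assumption; substitute whichever performance metric you actually track.

```python
from enum import Enum


class Action(Enum):
    ROLLBACK_NOW = "roll back immediately"
    ROLLBACK = "roll back"
    CONSIDER_ROLLBACK = "consider rolling back"
    FIX_FORWARD = "fix forward if quick"
    NONE = "no action"


def rollback_decision(
    service_up: bool,
    critical_errors: bool,
    latency_ms: float,
    baseline_latency_ms: float,
    minor_issues: bool = False,
) -> Action:
    """Return the action for the most severe symptom present."""
    if not service_up:
        return Action.ROLLBACK_NOW
    if critical_errors:
        return Action.ROLLBACK
    # Assumption: "degraded by more than 50%" means latency above 1.5x baseline.
    if latency_ms > 1.5 * baseline_latency_ms:
        return Action.CONSIDER_ROLLBACK
    if minor_issues:
        return Action.FIX_FORWARD
    return Action.NONE
```

Checking symptoms from most to least severe means a down service is never mistaken for a mere performance problem.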
Rollback Strategy by Platform
| Platform | Rollback Method |
|---|---|
| Vercel/Netlify | Redeploy previous commit |
| Railway/Render | Rollback in dashboard |
| VPS + PM2 | Restore backup, restart |
| Docker | Previous image tag |
| K8s | kubectl rollout undo |
Rollback Principles
- Speed over perfection: Roll back first, debug later
- Don't compound errors: One rollback, not multiple changes
- Communicate: Tell the team what happened
- Post-mortem: Understand why once the service is stable
6. Zero-Downtime Deployment
Strategies
| Strategy | How It Works |
|---|---|
| Rolling | Replace instances one by one |
| Blue-Green | Switch traffic between environments |
| Canary | Gradual traffic shift |
Selection Principles
| Scenario | Strategy |
|---|---|
| Standard release | Rolling |
| High-risk change | Blue-green (easy rollback) |
| Need validation | Canary (test with real traffic) |
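To make "gradual traffic shift" concrete, here is a sketch of a canary ramp that stops at the first step where the observed error rate is too high. Everything in it is illustrative: `run_canary`, the step percentages, and the error-rate threshold are hypothetical, and the actual traffic shifting would be done by your load balancer or platform.

```python
from typing import Callable, Sequence


def run_canary(
    steps: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> tuple[bool, int]:
    """Shift traffic step by step; return (promoted, last_percent_tried)."""
    last = 0
    for percent in steps:
        last = percent
        # Observe the new version at this traffic share before going further.
        if error_rate_at(percent) > max_error_rate:
            # The caller should route all traffic back to the old version.
            return (False, last)
    return (True, last)
```

For example, `run_canary([5, 25, 50, 100], measure, 0.01)` would abort at the first share where the hypothetical `measure` callable reports more than 1% errors, so only that fraction of users is exposed to the problem.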
7. Emergency Procedures
Service Down Priority
- Assess: What's the symptom?
- Quick fix: Restart if the cause is unclear
- Rollback: If the restart doesn't help
- Investigate: Once the service is stable
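The priority list above is an escalation ladder: try the cheapest recovery first and re-check health after each step. The sketch below illustrates that shape with injected callables. The name `restore_service` and the final `"escalate"` outcome (for when even a rollback does not restore health) are assumptions added for illustration, not part of the list itself.

```python
from typing import Callable


def restore_service(
    is_healthy: Callable[[], bool],
    restart: Callable[[], None],
    rollback: Callable[[], None],
) -> str:
    """Escalate from restart to rollback, re-checking health after each step."""
    if is_healthy():
        return "healthy"  # nothing to do
    restart()  # quick fix when the cause is unclear
    if is_healthy():
        return "restarted"
    rollback()  # the restart did not help
    if is_healthy():
        return "rolled_back"
    return "escalate"  # assumption: hand off to a human if rollback also fails
```

Investigation is deliberately absent from the function: it happens once the service is stable, not while it is down.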
Investigation Order
| Check | Common Issues |
|---|---|
| Logs | Errors, exceptions |
| Resources | Disk full, memory |
| Network | DNS, firewall |
| Dependencies | Database, APIs |
8. Anti-Patterns
| ❌ Don't | ✅ Do |
|---|---|
| Deploy on Friday | Deploy early in the week |
| Rush deployment | Follow the process |
| Skip staging | Always test in staging first |
| Deploy without backup | Backup before deploy |
| Walk away after deploy | Monitor for 15+ min |
| Multiple changes at once | One change at a time |
9. Decision Checklist
Before deploying:
- Platform-appropriate procedure?
- Backup strategy ready?
- Rollback plan documented?
- Monitoring configured?
- Team notified?
- Time to monitor after?
10. Best Practices
- Small, frequent deploys over big releases
- Feature flags for risky changes
- Automate repetitive steps
- Document every deployment
- Review what went wrong after every incident
- Test rollback before you need it
Remember: Every deployment is a risk. Minimize risk through preparation, not speed.