Agent skill

runbook-creation

Create operational runbooks, playbooks, standard operating procedures (SOPs), and incident response guides. Use when documenting operational procedures, on-call guides, or incident response processes.

Stars 151
Forks 20

Install this agent skill to your Project

npx add-skill https://github.com/aj-geddes/useful-ai-prompts/tree/main/skills/runbook-creation

SKILL.md

Runbook Creation

Table of Contents

  • Overview
  • When to Use
  • Quick Start
  • Reference Guides
  • Best Practices

Overview

Create comprehensive operational runbooks that provide step-by-step procedures for common operational tasks, incident response, and system maintenance.

When to Use

  • Incident response procedures
  • Standard operating procedures (SOPs)
  • On-call playbooks
  • System maintenance guides
  • Disaster recovery procedures
  • Deployment runbooks
  • Escalation procedures
  • Service restoration guides

Quick Start

Minimal working example:

markdown
# Incident Response Runbook

## Quick Reference

**Severity Levels:**

- P0 (Critical): Complete outage, data loss, security breach
- P1 (High): Major feature down, significant user impact
- P2 (Medium): Minor feature degradation, limited user impact
- P3 (Low): Cosmetic issues, minimal user impact

**Response Times:**

- P0: Immediate (24/7)
- P1: 15 minutes (business hours), 1 hour (after hours)
- P2: 4 hours (business hours)
- P3: Next business day

**Escalation Contacts:**

- On-call Engineer: PagerDuty rotation
- Engineering Manager: +1-555-0100
- VP Engineering: +1-555-0101
- CTO: +1-555-0102

// ... (see reference guides for full implementation)
```

## Reference Guides

Detailed implementations in the `references/` directory:

| Guide | Contents |
|---|---|
| [Symptoms](references/symptoms.md) | Symptoms, Severity: P0 (Critical), Initial Response (5 minutes) |
| [Investigation Steps](references/investigation-steps.md) | Investigation Steps |
| [Resolution Steps](references/resolution-steps.md) | Resolution Steps |
| [Verification](references/verification.md) | Verification |
| [Communication](references/communication.md) | Communication |
| [Post-Incident](references/post-incident.md) | Post-Incident |

## Best Practices

### ✅ DO

- Include quick reference section at top
- Provide exact commands to run
- Document expected outputs
- Include verification steps
- Add communication templates
- Define severity levels clearly
- Document escalation paths
- Include useful links and contacts
- Keep runbooks up-to-date
- Test runbooks regularly
- Include screenshots/diagrams
- Document common gotchas

### ❌ DON'T

- Use vague instructions
- Skip verification steps
- Forget to document prerequisites
- Assume knowledge of tools
- Skip communication guidelines
- Forget to update after incidents

Expand your agent's capabilities with these related and highly-rated skills.

aj-geddes/useful-ai-prompts

websocket-implementation

Implement real-time bidirectional communication with WebSockets including connection management, message routing, and scaling. Use when building real-time features, chat systems, live notifications, or collaborative applications.

151 20
Explore
aj-geddes/useful-ai-prompts

refactor-legacy-code

Modernize and improve legacy codebases while maintaining functionality. Use when you need to refactor old code, reduce technical debt, modernize deprecated patterns, or improve code maintainability without breaking existing behavior.

151 20
Explore
aj-geddes/useful-ai-prompts

Sentiment Analysis

Classify text sentiment using NLP techniques, lexicon-based analysis, and machine learning for opinion mining, brand monitoring, and customer feedback analysis

151 20
Explore
aj-geddes/useful-ai-prompts

flask-api-development

Develop lightweight Flask APIs with routing, blueprints, database integration, authentication, and request/response handling. Use when building RESTful APIs, microservices, or lightweight web services with Flask.

151 20
Explore
aj-geddes/useful-ai-prompts

ML Model Explanation

Interpret machine learning models using SHAP, LIME, feature importance, partial dependence, and attention visualization for explainability

151 20
Explore
aj-geddes/useful-ai-prompts

Statistical Hypothesis Testing

Conduct statistical tests including t-tests, chi-square, ANOVA, and p-value analysis for statistical significance, hypothesis validation, and A/B testing

151 20
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results