disaster-recovery-testing

Execute comprehensive disaster recovery tests, validate recovery procedures, and document lessons learned from DR exercises.

Install this agent skill in your project:

npx add-skill https://github.com/aj-geddes/useful-ai-prompts/tree/main/skills/disaster-recovery-testing

SKILL.md

Disaster Recovery Testing

Table of Contents

  • Overview
  • When to Use
  • Quick Start
  • Reference Guides
  • Best Practices

Overview

Implement systematic disaster recovery testing to validate recovery procedures, measure RTO/RPO, identify gaps, and ensure team readiness for actual incidents.
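Measuring RTO/RPO comes down to comparing three timestamps captured during the exercise. The following is a minimal sketch, not part of the skill itself: the `DrTestTimeline` class, its field names, and the sample timestamps are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrTestTimeline:
    """Timestamps captured during one DR exercise (hypothetical structure)."""

    last_good_backup: datetime   # most recent recoverable data point
    outage_start: datetime       # moment the simulated disaster began
    service_restored: datetime   # moment the service passed health checks again

    @property
    def achieved_rto(self) -> timedelta:
        # Recovery time: how long the service was unavailable.
        return self.service_restored - self.outage_start

    @property
    def achieved_rpo(self) -> timedelta:
        # Recovery point: how much data, measured in time, was lost.
        return self.outage_start - self.last_good_backup


def meets_targets(t: DrTestTimeline, rto_target: timedelta, rpo_target: timedelta) -> bool:
    """True only if both measured values are within their targets."""
    return t.achieved_rto <= rto_target and t.achieved_rpo <= rpo_target


# Made-up sample values, for illustration only.
timeline = DrTestTimeline(
    last_good_backup=datetime(2024, 1, 10, 1, 45),
    outage_start=datetime(2024, 1, 10, 2, 0),
    service_restored=datetime(2024, 1, 10, 2, 50),
)
print(timeline.achieved_rto)  # 0:50:00
print(timeline.achieved_rpo)  # 0:15:00
print(meets_targets(timeline, timedelta(hours=1), timedelta(minutes=30)))  # True
```

Recording the measured values next to the targets after every exercise makes regressions visible from one test to the next.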

When to Use

  • Scheduled DR exercises (annual at minimum)
  • Infrastructure changes
  • New service deployments
  • Compliance requirements
  • Team training
  • Recovery procedure validation
  • Cross-region failover testing

Quick Start

Minimal example (a Kubernetes ConfigMap that stores the test plan; truncated here):

yaml
# dr-test-plan.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dr-test-procedures
  namespace: operations
data:
  dr-test-plan.md: |
    # Disaster Recovery Test Plan

    ## Test Objectives
    - Validate backup restoration procedures
    - Verify failover mechanisms
    - Test DNS failover
    - Validate data integrity post-recovery
    - Measure RTO and RPO
    - Train incident response team

    ## Pre-Test Checklist
    - [ ] Notify stakeholders
    - [ ] Schedule 4-6 hour window
    - [ ] Disable alerting to prevent noise
    - [ ] Backup production data
    - [ ] Ensure DR environment is isolated
    - [ ] Have rollback plan ready
# ... (see the reference guides for the full implementation)
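A pre-test checklist like the one above can be used as a gate before the exercise starts. The sketch below assumes the checklist is available as Markdown text with `- [ ]` / `- [x]` items; the `unchecked_items` helper and the ticked states in the sample are hypothetical, for illustration only.

```python
import re

# Excerpt of the Pre-Test Checklist, with two items ticked for illustration.
CHECKLIST = """\
## Pre-Test Checklist
- [x] Notify stakeholders
- [x] Schedule 4-6 hour window
- [ ] Backup production data
- [ ] Have rollback plan ready
"""

# Matches "- [ ] text" or "- [x] text", with optional leading indentation.
ITEM = re.compile(r"^\s*- \[(?P<mark>[ xX])\] (?P<text>.+)$")


def unchecked_items(markdown: str) -> list[str]:
    """Return the text of every checklist item that is still unticked."""
    pending = []
    for line in markdown.splitlines():
        match = ITEM.match(line)
        if match and match.group("mark") == " ":
            pending.append(match.group("text").strip())
    return pending


pending = unchecked_items(CHECKLIST)
if pending:
    print("Not ready to start; open items:")
    for item in pending:
        print(f"  - {item}")
```

Refusing to start while any item is open keeps steps such as notifying stakeholders from being skipped under time pressure.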

Reference Guides

Detailed implementations in the references/ directory:

  • DR Test Plan and Execution
  • DR Test Script
  • DR Test Automation

Best Practices

✅ DO

  • Schedule regular DR tests
  • Document procedures in advance
  • Test in isolated environments
  • Measure actual RTO/RPO
  • Involve all teams
  • Automate validation
  • Record findings
  • Update procedures based on results
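One way to automate validation is to compare checksums of the source data against the restored copy. This is a sketch under stated assumptions: it treats the data as plain files on disk, and `sha256_of` and `compare_trees` are hypothetical helpers rather than part of the skill's reference scripts. Databases and object stores would need their own integrity checks.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source: Path, restored: Path) -> list[str]:
    """Return source files that are missing or differ in the restored tree."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = restored / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(src_file) != sha256_of(dst_file):
            problems.append(f"mismatch: {rel}")
    return problems


# Demonstration with throwaway directories and a simulated corrupted file.
with tempfile.TemporaryDirectory() as tmp:
    source, restored = Path(tmp, "source"), Path(tmp, "restored")
    source.mkdir()
    restored.mkdir()
    (source / "a.txt").write_text("alpha")
    (source / "b.txt").write_text("beta")
    (restored / "a.txt").write_text("alpha")
    (restored / "b.txt").write_text("BETA")  # simulated corruption
    problems_found = compare_trees(source, restored)

print(problems_found)  # ['mismatch: b.txt']
```

An empty result means every source file exists in the restored tree with identical content; the check does not flag extra files that appear only in the restored tree.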

❌ DON'T

  • Skip DR testing
  • Test during business hours
  • Test against production
  • Ignore test failures
  • Neglect post-test analysis
  • Forget to re-enable monitoring
  • Use stale backup processes
  • Test only once a year
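Forgetting to re-enable monitoring is easy when a test fails partway through. A context manager makes the restore step unconditional. In this sketch, `AlertingStub` is a hypothetical stand-in for whatever monitoring client is in use, not a real API.

```python
from contextlib import contextmanager


class AlertingStub:
    """Hypothetical stand-in for a monitoring client; not a real API."""

    def __init__(self) -> None:
        self.enabled = True

    def disable(self) -> None:
        self.enabled = False

    def enable(self) -> None:
        self.enabled = True


@contextmanager
def alerting_paused(alerting):
    """Pause alerting for the duration of the block and always restore it."""
    alerting.disable()
    try:
        yield alerting
    finally:
        # Runs whether the block succeeds or raises, so alerting is never left off.
        alerting.enable()


alerting = AlertingStub()
state_during_test = None
try:
    with alerting_paused(alerting):
        state_during_test = alerting.enabled
        raise RuntimeError("simulated failover failure")
except RuntimeError:
    pass

print(state_during_test)  # False
print(alerting.enabled)   # True
```

The same pattern applies to any temporary change made for the exercise, such as DNS overrides or scaled-down replicas.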
