Red/Blue Team Validator

"Find weaknesses before reality does."

Every proposition—whether a decision, strategy, architecture, or plan—has vulnerabilities. This skill systematically exposes them through iterative adversarial cycles. Red Team attacks with substantive, steel-manned challenges. Blue Team defends with mitigations and hardening. The cycle continues until convergence: a battle-tested proposition with documented defenses.

1. Purpose

Core Value Proposition

Static analysis misses what adversarial pressure reveals. Red/Blue validation simulates the attacks your proposition will face—from competitors, critics, reality itself—and forces you to build defenses before you need them. The output is not just a risk list, but a hardened proposition that has survived systematic assault.

Capabilities

#	Capability	Phase	Value
1	Proposition intake with attack surface mapping	Pre-Round	Define what can be attacked
2	Experience pool loading (domain failure patterns)	Pre-Round	Avoid reinventing known failures
3	Multi-category attack generation	Round N: Red	Surface vulnerabilities systematically
4	Steel-manning attacks to maximum strength	Round N: Red	Ensure attacks are not strawmen
5	Severity scoring (CRITICAL/HIGH/MEDIUM/LOW)	Round N: Red	Prioritize responses
6	Defense generation (REFUTE/MITIGATE/ACCEPT/HARDEN)	Round N: Blue	Address each attack
7	Proposition hardening through iterative refinement	Round N: Blue	Strengthen against attacks
8	Convergence evaluation with explicit criteria	Round N: Eval	Know when to stop
9	RISK-ASSESSMENT synthesis (CONTRACT-08)	Post-Round	Standardized output
10	Hardened proposition generation	Post-Round	Battle-tested version
11	Attack/defense log compilation	Post-Round	Audit trail
12	Go/no-go recommendation	Post-Round	Decision support

2. When to Use

Ideal Use Cases

Scenario	Why Red/Blue Validation Matters
Pre-commitment decision review	Simulate objections before committing resources
Strategy validation	War-game competitive responses and market realities
Architecture decision hardening	Stress-test technical choices before implementation
Proposal defense preparation	Anticipate and prepare for stakeholder pushback
Investment due diligence	Adversarial review of financial projections and market assumptions
Security posture assessment	Systematic attack surface enumeration
Go/no-go decisions	High-stakes decisions need adversarial pressure
Policy/process validation	Find edge cases and failure modes
Product launch readiness	Anticipate market, competitive, and operational challenges
M&A target evaluation	Adversarial review of synergy claims

Anti-Patterns (When NOT to Use)

Anti-Pattern	Why It's Ineffective	Better Alternative
Low-stakes decisions	Over-engineering for trivial choices	Just decide and iterate
Time-critical emergencies	Fires need extinguishing, not philosophy	Act, then debrief
Already committed	Adversarial review after commitment creates conflict	Use for future decisions
Early exploration	Premature to attack ideas still forming	Use after initial validation
Confirmation theater	Going through motions without genuine adversarial intent	Either commit to true adversarial thinking or skip
Reversible decisions	Two-way doors don't need siege testing	Save intensity for one-way doors

3. Parameters

Parameter	Type	Required	Default	Description
`subject_type`	enum	yes	—	decision \| strategy \| architecture \| plan \| policy \| investment \| security
`max_rounds`	integer	no	3	Maximum red/blue cycles (1-5)
`attack_intensity`	enum	no	standard	light \| standard \| aggressive
`attack_categories`	list	no	auto	Categories to probe (see catalog); auto selects by subject_type
`convergence_mode`	enum	no	no_new_critical	no_new_critical \| all_addressed \| round_limit
`include_experience_pool`	boolean	no	true	Load domain-specific failure patterns
`steel_manning_level`	enum	no	standard	minimal \| standard \| maximum
`output_mode`	enum	no	full_log	risk_assessment \| hardened_proposition \| full_log
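
The enum and range constraints in the table can be checked mechanically before a run starts. The following is a minimal sketch, not the skill's actual implementation: the `ValidatorParams` class is a hypothetical name, and it covers only the parameters whose allowed values are fully listed above.

```python
from dataclasses import dataclass
from typing import List, Union

SUBJECT_TYPES = {"decision", "strategy", "architecture", "plan",
                 "policy", "investment", "security"}
ATTACK_INTENSITIES = {"light", "standard", "aggressive"}
CONVERGENCE_MODES = {"no_new_critical", "all_addressed", "round_limit"}
STEEL_MANNING_LEVELS = {"minimal", "standard", "maximum"}


@dataclass
class ValidatorParams:
    """Hypothetical container; field names and defaults follow the table."""
    subject_type: str  # required, no default
    max_rounds: int = 3
    attack_intensity: str = "standard"
    attack_categories: Union[List[str], str] = "auto"
    convergence_mode: str = "no_new_critical"
    include_experience_pool: bool = True
    steel_manning_level: str = "standard"

    def __post_init__(self) -> None:
        # Reject values outside the enums listed in the table.
        checks = [
            ("subject_type", self.subject_type, SUBJECT_TYPES),
            ("attack_intensity", self.attack_intensity, ATTACK_INTENSITIES),
            ("convergence_mode", self.convergence_mode, CONVERGENCE_MODES),
            ("steel_manning_level", self.steel_manning_level, STEEL_MANNING_LEVELS),
        ]
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
        # max_rounds is documented as 1-5.
        if not 1 <= self.max_rounds <= 5:
            raise ValueError("max_rounds must be between 1 and 5")
```

Constructing `ValidatorParams(subject_type="architecture")` yields the documented defaults; an out-of-range `max_rounds` or an unknown enum value raises `ValueError`.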

Parameter Effects Matrix

Parameter	Red Phase Effect	Blue Phase Effect	Convergence Effect
`attack_intensity: light`	Top 3 attack categories	Quick defenses	max_rounds capped at 2
`attack_intensity: standard`	Top 5 attack categories	Full defense protocol	Normal convergence
`attack_intensity: aggressive`	All applicable categories	Exhaustive defense	Requires no_new_critical
`steel_manning_level: minimal`	1-pass attacks	—	Faster rounds
`steel_manning_level: standard`	2-pass steel-manning	—	Normal rounds
`steel_manning_level: maximum`	3-pass with ideological Turing test	—	Thorough rounds
`convergence_mode: no_new_critical`	—	—	Stop when 0 new CRITICAL/HIGH
`convergence_mode: all_addressed`	—	Must address all	Stop when no ACCEPT responses
`convergence_mode: round_limit`	—	—	Stop at max_rounds

Auto-Selected Attack Categories by Subject Type

Subject Type	Default Attack Categories
`decision`	ASSUMPTIONS, ALTERNATIVES, REVERSIBILITY, CONSEQUENCES, TIMING
`strategy`	COMPETITIVE, MARKET, EXECUTION, DEPENDENCIES, TIMELINE
`architecture`	SCALABILITY, SECURITY, DEPENDENCIES, OPERATIONAL, EDGE_CASES
`plan`	FEASIBILITY, RESOURCES, TIMELINE, DEPENDENCIES, RISKS
`policy`	EDGE_CASES, ENFORCEMENT, UNINTENDED_CONSEQUENCES, POLITICAL
`investment`	ECONOMIC, MARKET, EXECUTION, COMPETITIVE, ASSUMPTIONS
`security`	ATTACK_SURFACE, VULNERABILITIES, DEPENDENCIES, OPERATIONAL
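
The `auto` behavior of `attack_categories` amounts to a lookup keyed by `subject_type`. A minimal sketch, assuming an explicit list always overrides the defaults (the function name `resolve_categories` is hypothetical):

```python
from typing import Dict, List, Union

# Defaults copied from the table above.
DEFAULT_CATEGORIES: Dict[str, List[str]] = {
    "decision": ["ASSUMPTIONS", "ALTERNATIVES", "REVERSIBILITY", "CONSEQUENCES", "TIMING"],
    "strategy": ["COMPETITIVE", "MARKET", "EXECUTION", "DEPENDENCIES", "TIMELINE"],
    "architecture": ["SCALABILITY", "SECURITY", "DEPENDENCIES", "OPERATIONAL", "EDGE_CASES"],
    "plan": ["FEASIBILITY", "RESOURCES", "TIMELINE", "DEPENDENCIES", "RISKS"],
    "policy": ["EDGE_CASES", "ENFORCEMENT", "UNINTENDED_CONSEQUENCES", "POLITICAL"],
    "investment": ["ECONOMIC", "MARKET", "EXECUTION", "COMPETITIVE", "ASSUMPTIONS"],
    "security": ["ATTACK_SURFACE", "VULNERABILITIES", "DEPENDENCIES", "OPERATIONAL"],
}


def resolve_categories(subject_type: str,
                       attack_categories: Union[List[str], str] = "auto") -> List[str]:
    """Return the categories to probe for this run."""
    if attack_categories != "auto":
        # ASSUMPTION: an explicit list is used as given, without merging defaults.
        return list(attack_categories)
    if subject_type not in DEFAULT_CATEGORIES:
        raise ValueError(f"unknown subject_type: {subject_type!r}")
    return list(DEFAULT_CATEGORIES[subject_type])
```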

4. Checkpoints

This skill uses interactive checkpoints (see references/checkpoints.yaml) to resolve ambiguity:

subject_type_classification — When proposition type is ambiguous
attack_intensity_selection — When attack intensity not specified
convergence_mode_selection — When convergence criteria not specified
premature_convergence_check — When convergence met but warning signs present
infinite_loop_risk — When defenses generate more attacks than they resolve
output_mode_selection — When output format not specified
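
Four of these checkpoints fire when a parameter is simply missing, which can be sketched as a lookup. This is an illustration only: `pending_checkpoints` is a hypothetical helper, and the ambiguity-driven triggers (for example an ambiguous proposition type) require judgment that the sketch does not attempt.

```python
from typing import Any, Dict, List

# Parameter name -> checkpoint fired when that parameter is not specified.
PARAM_CHECKPOINTS: Dict[str, str] = {
    "subject_type": "subject_type_classification",
    "attack_intensity": "attack_intensity_selection",
    "convergence_mode": "convergence_mode_selection",
    "output_mode": "output_mode_selection",
}


def pending_checkpoints(supplied: Dict[str, Any]) -> List[str]:
    """List the checkpoints to raise for parameters the caller left out."""
    return [checkpoint
            for param, checkpoint in PARAM_CHECKPOINTS.items()
            if supplied.get(param) is None]
```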

5. Iterative Workflow

Workflow Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                       RED/BLUE TEAM VALIDATOR                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ╔══════════════════════════════════════════════════════════════════════╗   │
│  ║                      PRE-ROUND SETUP                                  ║   │
│  ║  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                   ║   │
│  ║  │ Proposition │  │   Attack    │  │ Experience  │                   ║   │
│  ║  │   Intake    │─▶│  Surface    │─▶│    Pool     │                   ║   │
│  ║  │             │  │  Mapping    │  │   Loading   │                   ║   │
│  ║  └─────────────┘  └─────────────┘  └─────────────┘                   ║   │
│  ╚══════════════════════════════════════════════════════════════════════╝   │
│                                  │                                           │
│                                  ▼                                           │
│  ╔══════════════════════════════════════════════════════════════════════╗   │
│  ║                         ROUND N                                       ║   │
│  ║  ┌─────────────┐                    ┌─────────────┐                  ║   │
│  ║  │  RED TEAM   │                    │  BLUE TEAM  │                  ║   │
│  ║  │   ATTACK    │───── Attacks ─────▶│   DEFENSE   │                  ║   │
│  ║  │ (Generate & │                    │ (Respond &  │                  ║   │
│  ║  │ Steel-man)  │                    │  Harden)    │                  ║   │
│  ║  └─────────────┘                    └─────────────┘                  ║   │
│  ║         │                                  │                          ║   │
│  ║         └────────────┬─────────────────────┘                          ║   │
│  ║                      ▼                                                ║   │
│  ║              ┌─────────────┐                                          ║   │
│  ║              │ EVALUATION  │                                          ║   │
│  ║              │ & Converge? │                                          ║   │
│  ║              └─────────────┘                                          ║   │
│  ║                      │                                                ║   │
│  ║           ┌──────────┴──────────┐                                     ║   │
│  ║           ▼                     ▼                                     ║   │
│  ║    [NOT CONVERGED]        [CONVERGED]                                 ║   │
│  ║    → Round N+1            → Exit loop                                 ║   │
│  ╚══════════════════════════════════════════════════════════════════════╝   │
│                                  │                                           │
│                                  ▼                                           │
│  ╔══════════════════════════════════════════════════════════════════════╗   │
│  ║                     POST-ROUND SYNTHESIS                              ║   │
│  ║  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                   ║   │
│  ║  │    RISK     │  │  Hardened   │  │  Attack/    │                   ║   │
│  ║  │ ASSESSMENT  │  │Proposition  │  │ Defense Log │                   ║   │
│  ║  │(CONTRACT-08)│  │  Output     │  │             │                   ║   │
│  ║  └─────────────┘  └─────────────┘  └─────────────┘                   ║   │
│  ╚══════════════════════════════════════════════════════════════════════╝   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Pre-Round Setup

Purpose: Prepare the battlefield—understand what's being tested and load relevant knowledge.

Steps:

Proposition Intake
- Receive subject (decision, strategy, architecture, plan, etc.)
- If verbal, request written summary or create one together
- Extract key claims and assertions to be defended
- Identify stakeholders and constraints
- Note: Proposition should be specific enough to attack
Attack Surface Mapping
- Identify dimensions available for attack (from attack-vector-catalog)
- Map proposition claims to attackable surfaces
- Select attack categories based on subject_type or explicit attack_categories
- See: references/attack-vector-catalog.md for categories
Experience Pool Loading (if include_experience_pool: true)
- Load domain-specific failure patterns
- Reference historical failures in similar contexts
- Prepare anti-patterns to probe
- See: references/experience-pool-patterns.md for patterns
Set Parameters
- Confirm attack intensity, convergence mode, steel-manning level
- Estimate expected rounds based on complexity
CHECKPOINT: subject_type_classification
- If subject_type not specified or ambiguous: AskUserQuestion
- Present subject type options with attack category implications
CHECKPOINT: attack_intensity_selection
- If attack_intensity not specified: AskUserQuestion
- Present intensity options with effort implications
CHECKPOINT: convergence_mode_selection
- If convergence_mode not specified: AskUserQuestion
- Present convergence options with trade-offs

Quality Gate: Attack Surface Mapped

Proposition boundaries explicitly defined
Attack categories selected (minimum 3)
Experience pool loaded (if enabled)
Parameters confirmed

Output: Attack-ready proposition with mapped attack surface

Round N: Red Team Phase

Purpose: Generate substantive, steel-manned attacks on the proposition.

Reference: See references/red-team-techniques.md and references/steel-manning-protocol.md.

Steps:

Attack Generation

For each attack category in scope, generate attacks:

Technique	When to Use	Expected Yield
Pre-mortem	Always	3-5 attacks
Inversion	Strategy, Decision	2-4 attacks
Competitor Simulation	Strategy, Investment	2-3 attacks
Stress Test Amplification	Architecture, Plan	2-4 attacks
Devil's Advocate	Policy, Decision	2-3 attacks
Blind Spot Hunter	All	1-3 attacks
Historical Pattern Matching	All (with experience pool)	2-4 attacks
Black Hat Thinking	Security, Competitive	3-5 attacks

See: references/red-team-techniques.md for protocols

Steel-Manning (per steel_manning_level)

For each attack, strengthen to maximum potency:

Level	Passes	Protocol
minimal	1	Basic attack formulation
standard	2	+ "How can this be more damaging?"
maximum	3	+ Ideological Turing test: "Would a true opponent accept this?"

Steel-manning checklist:
- Attack is specific, not vague
- Attack has clear mechanism of harm
- Attack includes realistic trigger conditions
- Attack would concern a reasonable proponent
- Attack is not easily dismissed
- See: references/steel-manning-protocol.md for full protocol

Severity Scoring

Score each attack using SEVERITY-SCORING (RUBRIC-07):

Severity	Definition	Response Urgency
CRITICAL	Blocks primary objective; cannot proceed	Must address in Blue Phase
HIGH	Significant impact; major rework required	Should address in Blue Phase
MEDIUM	Degrades quality; should fix but can proceed	Address if time permits
LOW	Minor issue; cosmetic	Document and monitor

Scoring dimensions:

Impact (0.5 weight): How damaging if attack succeeds?
Likelihood (0.3 weight): How likely is this attack vector?
Detectability (0.2 weight): How hard to see this coming?
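
The three weighted dimensions combine into a single score. The sketch below assumes each dimension is rated on a 0 to 1 scale and that a higher detectability rating means the attack is harder to see coming; the cut-offs that map a score to a severity label are purely illustrative, since they are defined in RUBRIC-07 and not reproduced here.

```python
# Weights as stated above.
WEIGHTS = {"impact": 0.5, "likelihood": 0.3, "detectability": 0.2}

# ASSUMPTION: illustrative cut-offs only; the real ones live in RUBRIC-07.
BANDS = [(0.75, "CRITICAL"), (0.50, "HIGH"), (0.25, "MEDIUM")]


def severity_score(impact: float, likelihood: float, detectability: float) -> float:
    """Weighted sum of the three dimensions, each rated 0.0-1.0."""
    ratings = {"impact": impact, "likelihood": likelihood,
               "detectability": detectability}
    for name, value in ratings.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in ratings.items())


def severity_band(score: float) -> str:
    """Map a score to a severity label using the assumed cut-offs."""
    for cutoff, label in BANDS:
        if score >= cutoff:
            return label
    return "LOW"
```

Because impact carries half the weight, an attack with maximum impact and nothing else already scores 0.5.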

Attack Documentation

For each attack:

Attack ID: ATK-[round]-[number]
Category: [From attack-vector-catalog]
Target: [What aspect of proposition]
Statement: [Clear attack formulation]
Mechanism: [How this would cause harm]
Severity: [CRITICAL | HIGH | MEDIUM | LOW]
Steel-manning: [minimal | standard | maximum] - [notes]
Experience pool match: [Pattern ID if applicable]

Quality Gate: Attacks Substantive

Minimum 3 attacks generated
At least 2 different attack categories represented
Steel-manning applied per level
No trivial or easily dismissed attacks
Severities assigned with rationale
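
The countable parts of this gate can be checked automatically. A minimal sketch, assuming a hypothetical `Attack` record that mirrors the documentation template; whether steel-manning was applied and whether an attack is trivial still need human review and are not checked here.

```python
from dataclasses import dataclass
from typing import List

SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}


@dataclass
class Attack:
    """Hypothetical record mirroring the attack documentation template."""
    round_number: int
    number: int
    category: str
    statement: str
    severity: str
    severity_rationale: str = ""

    @property
    def attack_id(self) -> str:
        # Follows the ATK-[round]-[number] convention.
        return f"ATK-{self.round_number}-{self.number}"


def red_gate_failures(attacks: List[Attack]) -> List[str]:
    """Return the gate criteria that the round's attacks do not meet."""
    failures = []
    if len(attacks) < 3:
        failures.append("fewer than 3 attacks generated")
    if len({a.category for a in attacks}) < 2:
        failures.append("fewer than 2 attack categories represented")
    for a in attacks:
        if a.severity not in SEVERITIES or not a.severity_rationale.strip():
            failures.append(f"{a.attack_id}: severity or rationale missing")
    return failures
```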

Output: Prioritized attack list for Blue Team

Round N: Blue Team Phase

Purpose: Respond to each attack with defenses, mitigations, or proposition hardening.

Reference: See references/blue-team-techniques.md for defense protocols.

Steps:

Attack Triage

Prioritize attacks by severity:
- CRITICAL: Must address this round
- HIGH: Should address this round
- MEDIUM: Address if time/capacity permits
- LOW: Document for monitoring

Defense Generation

For each attack, determine response type:

Response Type	When to Use	Effect
REFUTE	Attack is invalid; evidence proves it wrong	Attack dismissed
MITIGATE	Attack is valid; add safeguards	Risk reduced
ACCEPT	Attack is valid; no sufficient mitigation is available	Residual risk documented
HARDEN	Modify proposition to eliminate vulnerability	Proposition strengthened

Defense techniques:

Technique	Response Type	When to Use
Evidence-Based Refutation	REFUTE	When data contradicts attack
Mitigation Design	MITIGATE	When attack is valid but manageable
Contingency Planning	MITIGATE	When fallback is needed
Monitoring/Detection	MITIGATE	When early warning helps
Hardening Protocol	HARDEN	When proposition can be strengthened
Risk Transfer	MITIGATE	When others can absorb risk
Staged Commitment	MITIGATE	When phasing reduces exposure
Kill Switch Design	MITIGATE	When reversibility is critical

See: references/blue-team-techniques.md for detailed protocols

Defense Documentation

For each defense:

Defense ID: DEF-[round]-[number]
Attack Addressed: ATK-[round]-[number]
Response Type: [REFUTE | MITIGATE | ACCEPT | HARDEN]
Defense: [Specific response]
Evidence/Rationale: [Why this defense works]
Residual Risk: [ELIMINATED | REDUCED | UNCHANGED]
Proposition Change: [If HARDEN, what changed]

Proposition Hardening

Apply all HARDEN responses to proposition:
- Document each modification
- Track changes between rounds
- Maintain hardened proposition version
Defense Quality Check

For each defense, verify:
- Defense actually addresses the attack (not adjacent issue)
- REFUTE claims have supporting evidence
- MITIGATE responses are actionable
- ACCEPT responses have residual risk documented
- HARDEN changes don't introduce new vulnerabilities

Quality Gate: Attacks Addressed

Every attack has a defense response
CRITICAL attacks have REFUTE, MITIGATE, or HARDEN (not ACCEPT)
HIGH attacks have REFUTE, MITIGATE, HARDEN, or documented ACCEPT with rationale
Hardening changes documented
No hand-waving defenses
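
The coverage and ACCEPT rules in this gate lend themselves to a mechanical check. The sketch below is an assumption-laden illustration: `blue_gate_failures` and the dictionary shapes are hypothetical, it only rejects ACCEPT on CRITICAL attacks, and it leaves the "no hand-waving" judgment to a reviewer.

```python
from typing import Dict, List


def blue_gate_failures(severities: Dict[str, str],
                       defenses: Dict[str, Dict[str, str]]) -> List[str]:
    """Check defenses against the gate.

    severities: attack ID -> severity label.
    defenses:   attack ID -> {"response_type": ..., "rationale": ...}.
    """
    failures = []
    for attack_id, severity in severities.items():
        defense = defenses.get(attack_id)
        if defense is None:
            failures.append(f"{attack_id}: no defense response")
            continue
        response = defense["response_type"]
        if severity == "CRITICAL" and response == "ACCEPT":
            failures.append(f"{attack_id}: CRITICAL attack may not be ACCEPTed")
        if severity == "HIGH" and response == "ACCEPT" and not defense.get("rationale"):
            failures.append(f"{attack_id}: HIGH ACCEPT needs a documented rationale")
    return failures
```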

Output: Defense log with updated (hardened) proposition

Round N: Evaluation Phase

Purpose: Determine if another round is needed or convergence achieved.

Reference: See references/convergence-criteria.md for detailed criteria.

Steps:

Assess Round Quality

Red Team assessment:
- Were attacks substantive or rehashes of previous rounds?
- Are there novel attack angles remaining?
- Is Red Team finding diminishing returns?
Blue Team assessment:
- Were defenses genuine or hand-waving?
- Are mitigations actionable?
- Has proposition been strengthened?

Apply Convergence Criteria

Mode	Stop When	Continue When
`no_new_critical`	Round produced 0 new CRITICAL or HIGH attacks	New CRITICAL or HIGH attacks found
`all_addressed`	No ACCEPT responses remain (all REFUTE/MITIGATE/HARDEN)	Any ACCEPT responses remain
`round_limit`	`max_rounds` reached	Below `max_rounds`

Override conditions (continue despite convergence):

Obvious attack categories not yet explored
Blue Team defenses appear inadequate
Stakeholder requests additional scrutiny

Premature termination signs (don't stop too early):

Fewer than 2 rounds completed
CRITICAL attacks still have ACCEPT responses
Key attack categories unexplored

Document Convergence Decision

Round [N] Evaluation:
- New CRITICAL attacks: [count]
- New HIGH attacks: [count]
- ACCEPT responses remaining: [count]
- Convergence mode: [mode]
- Decision: [CONTINUE | CONVERGED]
- Rationale: [explanation]

Proceed or Exit
- If NOT CONVERGED: Increment round, return to Red Phase
- If CONVERGED: Proceed to Post-Round Synthesis
CHECKPOINT: premature_convergence_check
- If convergence met but warning signs present: AskUserQuestion
- Warning signs: <2 rounds, CRITICAL ACCEPTs remain, key categories unexplored
CHECKPOINT: infinite_loop_risk
- If new attacks from defenses exceed previous round: AskUserQuestion
- May indicate fundamental proposition issues
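
The two evaluation checkpoints reduce to simple comparisons once the round's counts are known. A minimal sketch with hypothetical argument names; it reads "new attacks from defenses exceed previous round" as comparing the number of attacks spawned by this round's defenses with the previous round's attack count, which is one possible interpretation.

```python
from typing import List


def evaluation_checkpoints(converged: bool,
                           rounds_completed: int,
                           critical_accepts: int,
                           unexplored_key_categories: int,
                           defense_spawned_attacks: int,
                           previous_round_attacks: int) -> List[str]:
    """Return the evaluation-phase checkpoints that should be raised."""
    fired = []
    # Warning signs: <2 rounds, CRITICAL ACCEPTs remain, key categories unexplored.
    warning_signs = (rounds_completed < 2
                     or critical_accepts > 0
                     or unexplored_key_categories > 0)
    if converged and warning_signs:
        fired.append("premature_convergence_check")
    # ASSUMPTION: compare attacks spawned by defenses with last round's total.
    if defense_spawned_attacks > previous_round_attacks:
        fired.append("infinite_loop_risk")
    return fired
```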

Quality Gate: Convergence Evaluated

Explicit continue/stop decision documented
Rationale provided
Override conditions checked
Premature termination signs checked

Output: Convergence decision with rationale

Post-Round Synthesis

Purpose: Compile findings into actionable outputs.

Reference: See templates/ for output formats.

CHECKPOINT: output_mode_selection

If output_mode not specified: AskUserQuestion
Options: risk_assessment, hardened_proposition, full_log

Steps:

Compile Attack/Defense Log

Consolidate all rounds:
- All attacks with responses
- Round-by-round progression
- Convergence trajectory
- See: templates/attack-defense-log.md

Derive RISK-ASSESSMENT (CONTRACT-08)

Transform unresolved attacks into risks:

Attack Status	Risk Derivation
ACCEPT response	Direct risk: attack remains valid
MITIGATE with residual risk REDUCED or UNCHANGED	Risk: partially addressed
MITIGATE with residual risk ELIMINATED	No risk (resolved)
REFUTE	No risk (invalid attack)
HARDEN	No risk (vulnerability removed)
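
The derivation table is effectively a small decision function over the response type and the residual-risk field from the defense documentation. A sketch, with `derive_risk` and its return labels as hypothetical names:

```python
from typing import Optional


def derive_risk(response_type: str, residual_risk: Optional[str] = None) -> Optional[str]:
    """Return "direct", "partial", or None (no risk carried forward)."""
    if response_type == "ACCEPT":
        return "direct"          # attack remains valid
    if response_type == "MITIGATE":
        # Only a fully eliminated residual risk drops out of the assessment.
        return None if residual_risk == "ELIMINATED" else "partial"
    if response_type in ("REFUTE", "HARDEN"):
        return None              # invalid attack or vulnerability removed
    raise ValueError(f"unknown response type: {response_type!r}")
```

Attacks for which this returns a value are the ones that feed the RISK-ASSESSMENT; the rest appear only in the attack/defense log.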

Score each derived risk using SEVERITY-SCORING (RUBRIC-07), and include the mitigations from the Blue Team responses.

See: templates/risk-assessment-output.md

Generate Hardened Proposition

Compile final version:
- Original proposition + all HARDEN modifications
- List of accepted residual risks
- Battle-tested confidence score
- Conditions for validity
- Review triggers
- See: templates/hardened-proposition-output.md

Calculate Battle-Tested Confidence

Score based on:

Rounds completed (more = higher confidence)
Attack quality (substantive attacks survived)
Defense quality (genuine defenses, not hand-waving)
Residual risk profile (fewer ACCEPT = higher confidence)

Score	Meaning
80-100	High confidence: withstood aggressive scrutiny
60-79	Moderate confidence: key challenges addressed
40-59	Low confidence: significant risks remain
0-39	Very low confidence: fundamental issues unresolved

Generate Go/No-Go Recommendation

Recommendation	When
PROCEED	Low/very low residual risk; proposition battle-tested
PROCEED_WITH_CAUTION	Moderate risk; mitigations in place
SIGNIFICANT_CONCERNS	High risk; key attacks unresolved
DO_NOT_PROCEED	Very high risk; fundamental flaws exposed
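
The confidence bands and the recommendation table above can both be read as lookups. The sketch below only encodes those two tables; how the confidence score itself is computed from rounds, attack quality, and defense quality is not specified here, so no formula is attempted. The function and dictionary names are hypothetical.

```python
def confidence_band(score: int) -> str:
    """Map a 0-100 battle-tested confidence score to its band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "High confidence"
    if score >= 60:
        return "Moderate confidence"
    if score >= 40:
        return "Low confidence"
    return "Very low confidence"


# Residual risk level -> recommendation, as in the go/no-go table.
RECOMMENDATION_BY_RISK = {
    "very low": "PROCEED",
    "low": "PROCEED",
    "moderate": "PROCEED_WITH_CAUTION",
    "high": "SIGNIFICANT_CONCERNS",
    "very high": "DO_NOT_PROCEED",
}
```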

Quality Gates:

RISK-ASSESSMENT complete with go/no-go
All attacks traced to risks or resolutions
Hardened proposition documented
Battle-tested confidence calculated
Attack/defense log compiled

Output: RISK-ASSESSMENT (CONTRACT-08), Hardened Proposition, Attack/Defense Log

5. Attack Vector Catalog

Ten categories of attacks, applicable across subject types:

5.1 ASSUMPTIONS

Definition: Attacks targeting hidden, unstated, or fragile assumptions.

Attack Pattern	Target	Example
Hidden assumption exposure	Unstated beliefs	"You're assuming customers want this feature"
Load-bearing challenge	Critical assumptions	"If this assumption fails, the whole plan collapses"
Temporal decay	Time-sensitive assumptions	"This assumption won't hold in 2 years"
Behavioral assumptions	Human behavior predictions	"You're assuming the team will change behavior"
Counterfactual reversal	Any assumption	"What if the opposite is true?"

Risk Level: HIGH (often invisible until failure)

5.2 DEPENDENCIES

Definition: Attacks targeting external or internal dependencies.

Attack Pattern	Target	Example
External dependency failure	Third parties	"What if the vendor goes out of business?"
Technology obsolescence	Tech dependencies	"This framework may not be maintained in 3 years"
Team capability dependency	People	"This requires skills the team doesn't have"
Resource availability	Resources	"What if the budget is cut 30%?"
Single point of failure	Critical dependencies	"Everything depends on this one system"

Risk Level: HIGH (external factors often uncontrollable)

5.3 EDGE_CASES

Definition: Attacks targeting boundary conditions and unusual scenarios.

Attack Pattern	Target	Example
Boundary conditions	Limits	"What happens at 0? At max capacity?"
Scale extremes	Very large/small	"Does this work with 1 user? 1 million?"
Timing edge cases	Timing	"What if these events happen simultaneously?"
Data quality	Inputs	"What if the input data is garbage?"
Concurrency/race conditions	Parallel operations	"What if two users do this at the same time?"

Risk Level: MEDIUM (often discoverable through testing)

5.4 SCALABILITY

Definition: Attacks targeting ability to grow or shrink.

Attack Pattern	Target	Example
Horizontal scaling limits	Adding instances	"Can you just add more servers?"
Vertical scaling limits	Bigger instances	"What if you need 10x the memory?"
Cost scaling non-linearity	Economics	"Costs grow O(n²) with users"
Operational complexity	Team capacity	"Can the team manage 50 services?"
Data volume scaling	Storage/processing	"What happens with 10TB of data?"

Risk Level: HIGH (often not discovered until growth happens)

5.5 SECURITY

Definition: Attacks targeting security posture and vulnerabilities.

Attack Pattern	Target	Example
Attack surface exposure	Entry points	"Every API is an attack vector"
Data breach scenarios	Data protection	"What if this database is compromised?"
Authentication gaps	Identity	"How do you prevent unauthorized access?"
Authorization gaps	Permissions	"Can users access others' data?"
Compliance violations	Regulations	"Does this violate GDPR?"

Risk Level: CRITICAL (security failures can be catastrophic)

5.6 COMPETITIVE

Definition: Attacks targeting competitive dynamics.

Attack Pattern	Target	Example
Competitor response	Competitive reaction	"What will [competitor] do when they see this?"
Market timing	Windows	"The market window may close before launch"
Differentiation erosion	Uniqueness	"This feature can be copied in weeks"
Pricing pressure	Economics	"Competitor can undercut by 50%"
Acquisition/partnership disruption	Strategic moves	"What if competitor acquires your key partner?"

Risk Level: HIGH (competitive dynamics are unpredictable)

5.7 OPERATIONAL

Definition: Attacks targeting day-to-day operations.

Attack Pattern	Target	Example
Complexity explosion	Manageability	"This will be impossible to debug"
Incident scenarios	Failure recovery	"What's the MTTR when this breaks at 3 AM?"
Recovery time	Resilience	"Can you recover within SLA?"
Monitoring gaps	Observability	"How would you even know it's failing?"
On-call burden	Team health	"This will burn out the team"

Risk Level: MEDIUM-HIGH (operational issues compound)

5.8 ECONOMIC

Definition: Attacks targeting financial viability.

Attack Pattern	Target	Example
Unit economics failure	Per-unit costs	"Each customer costs more than they pay"
Cost structure vulnerability	Fixed costs	"Break-even requires 10x current volume"
Revenue model fragility	Income sources	"Revenue depends on one customer segment"
Funding/cash flow	Capital	"You'll run out of runway in 8 months"
Market size overestimation	TAM/SAM/SOM	"Your market is 1/10th the claimed size"

Risk Level: HIGH (financial failure is existential)

5.9 ORGANIZATIONAL

Definition: Attacks targeting people and organization.

Attack Pattern	Target	Example
Capability gaps	Skills	"No one on the team has done this before"
Key person dependency	Individuals	"If [person] leaves, this fails"
Cultural resistance	Adoption	"The organization will reject this change"
Political opposition	Stakeholders	"[Executive] will block this"
Change management	Transition	"Users will refuse to migrate"

Risk Level: MEDIUM-HIGH (organizational dynamics are complex)

5.10 TEMPORAL

Definition: Attacks targeting timing and duration.

Attack Pattern	Target	Example
Timeline compression	Deadlines	"What if the deadline is moved up 3 months?"
Timeline extension impact	Delays	"What if this takes twice as long?"
Market window closure	Timing	"The opportunity won't exist in 12 months"
Technology obsolescence	Tech lifecycle	"This technology will be obsolete"
Regulatory timeline	External deadlines	"Regulation changes in 6 months"

Risk Level: HIGH (timing failures are often unrecoverable)

6. Convergence Criteria

Mode Definitions

Mode	Definition	Best For
`no_new_critical`	Stop when round produces 0 new CRITICAL or HIGH attacks	Most use cases
`all_addressed`	Stop when no ACCEPT responses remain	High-stakes decisions
`round_limit`	Stop at `max_rounds` regardless	Time-constrained reviews

Measurement Methods

no_new_critical:

Count CRITICAL attacks generated this round: must be 0
Count HIGH attacks generated this round: must be 0
Attacks that are variants of previous attacks don't count as "new"

all_addressed:

Count ACCEPT responses across all rounds
Must be 0 (all attacks have REFUTE, MITIGATE, or HARDEN)

round_limit:

Simply check current_round >= max_rounds
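
Taken together, the three measurement methods can be sketched as one function. This is an illustration under stated assumptions: `is_converged` is a hypothetical name, and the caller is expected to have already excluded variants of earlier attacks from the "new" counts.

```python
def is_converged(mode: str,
                 current_round: int,
                 max_rounds: int,
                 new_critical: int,
                 new_high: int,
                 accept_count: int) -> bool:
    """Apply the stop condition for the given convergence mode.

    new_critical / new_high: attacks generated this round, variants excluded.
    accept_count: ACCEPT responses across all rounds.
    """
    if mode == "no_new_critical":
        return new_critical == 0 and new_high == 0
    if mode == "all_addressed":
        return accept_count == 0
    if mode == "round_limit":
        return current_round >= max_rounds
    raise ValueError(f"unknown convergence mode: {mode!r}")
```

The sketch evaluates each mode exactly as defined; since `max_rounds` is described as a maximum, a caller would presumably also stop once it is reached in the other two modes.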

Override Conditions (Continue Despite Convergence)

Obvious attack categories not yet explored
Stakeholder requests additional rounds
Blue Team defenses appear superficial
Recent hardening changes may introduce new vulnerabilities

Premature Termination Signs (Don't Stop Too Early)

Fewer than 2 rounds completed
CRITICAL attacks still have ACCEPT responses
Attack quality improving (not diminishing) each round
Key experience pool patterns not yet probed

7. Output Specifications

7.1 Primary Output: RISK-ASSESSMENT

Compliant with CONTRACT-08 from artifact-contracts.yaml.

See: templates/risk-assessment-output.md for complete XML template.

Key extensions for adversarial validation:

<adversarial_summary> with attack/defense statistics
Risks traced to source attacks (ATK-X-Y)
Battle-tested confidence score
Defense quality assessment

7.2 Secondary Output: Hardened Proposition

See: templates/hardened-proposition-output.md for complete template.

Includes:

Original proposition vs. battle-tested version
All modifications with rationale
Accepted residual risks
Conditions for validity
Review triggers

7.3 Secondary Output: Attack/Defense Log

See: templates/attack-defense-log.md for complete template.

Includes:

Round-by-round attack and defense tables
Convergence evaluation per round
Severity distribution
Resolution statistics

8. Quality Gates Summary

#	Gate	Criterion	Phase
1	Attack Surface Mapped	Proposition boundaries defined, categories selected	Pre-Round
2	Experience Pool Loaded	Domain patterns available (if enabled)	Pre-Round
3	Attacks Substantive	Attacks are non-trivial, steel-manned	Round N: Red
4	Attacks Diverse	At least 2 different categories represented	Round N: Red
5	Severities Assigned	All attacks have severity with rationale	Round N: Red
6	All Attacks Addressed	Every attack has a defense response	Round N: Blue
7	Critical Attacks Defended	CRITICAL have REFUTE, MITIGATE, or HARDEN; HIGH may ACCEPT only with documented rationale	Round N: Blue
8	No Hand-Waving	Defenses are actionable, not vague	Round N: Blue
9	Convergence Evaluated	Explicit continue/stop decision	Round N: Eval
10	Risks Derived	Unresolved attacks become risks	Post-Round
11	Go/No-Go Issued	Clear recommendation	Post-Round
12	Hardened Proposition	Battle-tested version documented	Post-Round

Gate Requirements by Intensity

Gate	Light	Standard	Aggressive
Attack categories	3	5	All applicable
Minimum attacks	5	10	15+
Steel-manning level	minimal	standard	maximum
Convergence mode	round_limit (2)	no_new_critical	no_new_critical
Max rounds	2	3	5
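
The intensity table can be treated as a set of presets applied on top of the run parameters. A minimal sketch: the `INTENSITY_PRESETS` structure and `apply_preset` helper are hypothetical, and "top N categories" is read as the first N entries of an already prioritized list.

```python
from typing import Any, Dict, List

# Values copied from the table; category_limit None means "all applicable".
INTENSITY_PRESETS: Dict[str, Dict[str, Any]] = {
    "light": {"category_limit": 3, "minimum_attacks": 5,
              "steel_manning_level": "minimal",
              "convergence_mode": "round_limit", "max_rounds": 2},
    "standard": {"category_limit": 5, "minimum_attacks": 10,
                 "steel_manning_level": "standard",
                 "convergence_mode": "no_new_critical", "max_rounds": 3},
    "aggressive": {"category_limit": None, "minimum_attacks": 15,
                   "steel_manning_level": "maximum",
                   "convergence_mode": "no_new_critical", "max_rounds": 5},
}


def apply_preset(intensity: str, applicable_categories: List[str]) -> Dict[str, Any]:
    """Return the gate requirements for an intensity, with categories resolved."""
    preset = dict(INTENSITY_PRESETS[intensity])  # copy so the table stays intact
    limit = preset.pop("category_limit")
    # ASSUMPTION: the input list is already ordered by priority.
    preset["attack_categories"] = list(applicable_categories[:limit])
    return preset
```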

9. Behavioral Guidelines

Red Team Principles

Steel-man, don't strawman: Make attacks as strong as possible
Attack the proposition, not the proposer: Focus on ideas, not people
Be creative but realistic: Novel attacks should be plausible
Prioritize ruthlessly: CRITICAL issues first
Use the experience pool: Don't reinvent known failures
Ideological Turing test: Would a true critic accept this attack?

Blue Team Principles

Defend genuinely, don't dismiss: Every attack deserves honest consideration
Evidence over assertion: REFUTE claims need proof
Actionable mitigations: MITIGATE responses must be specific
Honest acceptance: If you can't defend, ACCEPT the risk
Harden proactively: Don't wait for attacks to strengthen
Avoid defensive denial: Admitting weakness is strength

Tone Calibration

Intensity	Red Team Tone	Blue Team Tone
Light	Collaborative skeptic	Quick sanity check
Standard	Professional adversary	Thorough defense
Aggressive	Determined opponent	Comprehensive rebuttal

10. Workflow Integration

Upstream Skills

Skill	Provides	Use Case
`assumption-validator`	Assumption inventory	Attack assumptions already surfaced
`expert-panel-deliberation`	Multi-perspective input	Diverse attack/defense viewpoints
`research-interviewer`	KNOWLEDGE-CORPUS	Domain knowledge for attacks

Downstream Skills

Skill	Receives	Use Case
`expert-panel-deliberation`	RISK-ASSESSMENT	Panel review of risks
`generate-ideas`	Attack gaps	Generate alternatives for failed propositions

Skill Chaining Example

assumption-validator      → RISK-ASSESSMENT (assumption-derived)
                               ↓
red-blue-validator        → RISK-ASSESSMENT (adversarial-derived)
                               ↓
expert-panel-deliberation → Final recommendation with multi-expert review

11. References

Document	Purpose
`references/attack-vector-catalog.md`	10 attack categories with specific attacks
`references/red-team-techniques.md`	8 attack generation techniques
`references/blue-team-techniques.md`	8 defense techniques
`references/steel-manning-protocol.md`	Protocol for maximizing attack strength
`references/convergence-criteria.md`	Detailed criteria for stopping
`references/experience-pool-patterns.md`	50+ failure patterns by domain

Core Library References

Library	Element	Usage
`core/skill-patterns.yaml`	PATTERN-06: ADVERSARIAL-VALIDATE	Workflow pattern
`core/artifact-contracts.yaml`	CONTRACT-08: RISK-ASSESSMENT	Output format
`core/scoring-rubrics.yaml`	RUBRIC-07: SEVERITY-SCORING	Attack severity
`core/technique-taxonomy.yaml`	CAT-UR, CAT-PP	Adversarial techniques

12. Templates

Template	Purpose
`templates/risk-assessment-output.md`	CONTRACT-08 compliant RISK-ASSESSMENT with adversarial extensions
`templates/attack-defense-log.md`	Round-by-round attack/defense documentation
`templates/hardened-proposition-output.md`	Battle-tested proposition with modifications

13. Examples

Example 1: Architecture Decision — Microservices Migration

yaml

input:
  subject: "Migrate payment processing from monolith to microservices"
  subject_type: architecture
  max_rounds: 3
  attack_intensity: standard
  convergence_mode: no_new_critical
  include_experience_pool: true
  steel_manning_level: standard

flow:
  pre_round:
    proposition: "Decompose payment monolith into 5 microservices over 12 months"
    attack_surface:
      - ASSUMPTIONS: Team capability, timeline, complexity estimates
      - DEPENDENCIES: Infrastructure, vendor APIs, data consistency
      - SCALABILITY: Service coordination overhead
      - OPERATIONAL: Debugging distributed systems
      - EDGE_CASES: Partial failures, network partitions
    experience_pool_loaded:
      - "Distributed monolith anti-pattern"
      - "Service boundary misalignment"
      - "Operational complexity explosion"

  round_1:
    red_team:
      attacks:
        - ATK-1-1: "Team has zero production microservices experience"
          Category: ORGANIZATIONAL
          Severity: CRITICAL
          Steel-manned: "Even with training, production microservices require
                        tacit knowledge that only comes from operating them"

        - ATK-1-2: "12-month timeline ignores learning curve and unknowns"
          Category: TEMPORAL
          Severity: HIGH
          Steel-manned: "Industry benchmarks show microservices migrations
                        typically take 2-3x initial estimates"

        - ATK-1-3: "Distributed transactions will break payment consistency"
          Category: EDGE_CASES
          Severity: CRITICAL
          Steel-manned: "Payment systems require ACID guarantees that
                        eventual consistency cannot provide"

        - ATK-1-4: "Debugging distributed payment failures at 3 AM"
          Category: OPERATIONAL
          Severity: HIGH
          Steel-manned: "When payments fail across service boundaries,
                        MTTR could exceed SLA without distributed tracing expertise"

      new_critical: 2
      new_high: 2

    blue_team:
      defenses:
        - DEF-1-1: Response to ATK-1-1
          Type: MITIGATE
          Defense: "Hire 2 senior engineers with microservices experience.
                   Engage architecture consultancy for first 6 months."
          Residual: REDUCED

        - DEF-1-2: Response to ATK-1-2
          Type: HARDEN
          Defense: "Extend timeline to 18 months. Add 3-month buffer for unknowns."
          Proposition Change: '"12 months" → "18 months with 3-month buffer"'
          Residual: ELIMINATED

        - DEF-1-3: Response to ATK-1-3
          Type: HARDEN
          Defense: "Keep payment processing in single service with ACID guarantees.
                   Only extract non-critical services to microservices."
          Proposition Change: '"5 microservices" → "3 microservices + 1 payment service"'
          Residual: ELIMINATED

        - DEF-1-4: Response to ATK-1-4
          Type: MITIGATE
          Defense: "Implement distributed tracing (Jaeger) before migration.
                   Establish on-call runbooks. Require observability as launch gate."
          Residual: REDUCED

    evaluation:
      new_critical: 2
      new_high: 2
      convergence_mode: no_new_critical
      decision: CONTINUE
      rationale: "New critical attacks found; continue to Round 2"

  round_2:
    red_team:
      attacks:
        - ATK-2-1: "Hiring 2 senior engineers within the 6-month consultancy window is optimistic"
          Category: ORGANIZATIONAL
          Severity: HIGH
          Steel-manned: "Market for microservices expertise is extremely
                        competitive; hiring may slip past the consultancy engagement"

        - ATK-2-2: "Distributed tracing adds operational complexity itself"
          Category: OPERATIONAL
          Severity: MEDIUM
          Steel-manned: "Jaeger requires infrastructure, maintenance,
                        and expertise to operate"

        - ATK-2-3: "Service boundary around payments may be wrong"
          Category: ASSUMPTIONS
          Severity: MEDIUM
          Steel-manned: "Without event storming, service boundaries
                        are guesses that may need rework"

      new_critical: 0
      new_high: 1

    blue_team:
      defenses:
        - DEF-2-1: Response to ATK-2-1
          Type: MITIGATE
          Defense: "Begin hiring immediately. Have contingency: extend
                   consultancy or use contractor bridge if hiring slips."
          Residual: REDUCED

        - DEF-2-2: Response to ATK-2-2
          Type: ACCEPT
          Defense: "Accept additional complexity as cost of observability.
                   Allocate 0.5 FTE for observability platform."
          Residual: UNCHANGED (but monitored)

        - DEF-2-3: Response to ATK-2-3
          Type: HARDEN
          Defense: "Conduct event storming workshop before finalizing
                   service boundaries. Add 4 weeks for domain modeling."
          Proposition Change: Add event storming phase
          Residual: ELIMINATED

    evaluation:
      new_critical: 0
      new_high: 1
      convergence_mode: no_new_critical
      decision: CONTINUE
      rationale: "New HIGH attack found; continue to Round 3"

  round_3:
    red_team:
      attacks:
        - ATK-3-1: "Event storming may reveal the migration is unnecessary"
          Category: ASSUMPTIONS
          Severity: MEDIUM
          Steel-manned: "Domain modeling might show modular monolith
                        is sufficient for scaling needs"

        - ATK-3-2: "Consultancy dependency creates knowledge transfer risk"
          Category: DEPENDENCIES
          Severity: MEDIUM
          Steel-manned: "If consultants leave, tacit knowledge leaves with them"

      new_critical: 0
      new_high: 0

    blue_team:
      defenses:
        - DEF-3-1: Response to ATK-3-1
          Type: ACCEPT
          Defense: "Valid point. Event storming is a gate; if it reveals
                   microservices aren't needed, we pivot to modular monolith."
          Residual: UNCHANGED (but this is a feature, not a bug)

        - DEF-3-2: Response to ATK-3-2
          Type: MITIGATE
          Defense: "Require knowledge transfer sessions, documentation
                   deliverables, and pair programming in consultancy contract."
          Residual: REDUCED

    evaluation:
      new_critical: 0
      new_high: 0
      convergence_mode: no_new_critical
      decision: CONVERGED
      rationale: "No new CRITICAL or HIGH attacks. Proposition has been hardened."

  post_round:
    hardened_proposition:
      original: "Migrate payment processing from monolith to 5 microservices
                over 12 months"

      battle_tested: "Migrate to 3 microservices + 1 payment service over
                     18 months (with 3-month buffer), after event storming
                     confirms boundaries. Requires: 2 hired senior engineers,
                     architecture consultancy, distributed tracing infrastructure,
                     0.5 FTE observability platform maintenance."

      modifications:
        - MOD-1: 5 services → 3 + 1 payment (Response to ATK-1-3)
        - MOD-2: 12 months → 18 months + buffer (Response to ATK-1-2)
        - MOD-3: Added event storming prerequisite (Response to ATK-2-3)
        - MOD-4: Added observability infrastructure requirement (Response to ATK-1-4)

      accepted_residual_risks:
        - "Hiring timeline may slip (mitigated by contingency)"
        - "Observability platform adds operational overhead (accepted)"
        - "Event storming may reveal migration unnecessary (feature)"
        - "Consultancy knowledge transfer requires active management"

      battle_tested_confidence: 72
      confidence_rationale: "Proposition survived 3 rounds of substantive attacks.
                           Critical issues addressed through hardening.
                           Residual risks are manageable and monitored."

    risk_assessment:
      total_risks: 4
      critical_risks: 0
      high_risks: 1 (hiring timeline)
      moderate_risks: 3
      risk_profile: MODERATE
      go_no_go: PROCEED_WITH_CAUTION

      recommendation: |
        PROCEED_WITH_CAUTION — The original proposition had critical flaws
        (team capability, payment consistency, timeline). The hardened
        proposition addresses these through:
        - Scoped migration (keeping payments transactional)
        - Extended timeline with buffer
        - Event storming validation gate
        - Observability prerequisites

        Key risks to monitor:
        1. Hiring: Start immediately; have contingency ready
        2. Consultancy knowledge transfer: Contract requirements
        3. Event storming outcome: Be prepared to pivot if boundaries don't hold
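
The three evaluation blocks in Example 1 apply one rule: count the new attacks by severity, then continue while any new CRITICAL or HIGH attack appears. The Python sketch below reproduces those decisions. It is inferred from this example's rationales only and is not the formal definition of `no_new_critical`, which lives in the Convergence Criteria section; the names `evaluate_round` and `BLOCKING` are hypothetical.

```python
from collections import Counter

# Severities that block convergence in Example 1 (an assumption inferred from
# its round rationales, not the skill's formal mode definition).
BLOCKING = ("CRITICAL", "HIGH")


def evaluate_round(severities):
    """Count new attacks by severity and decide whether another round is needed."""
    counts = Counter(severities)
    decision = "CONTINUE" if any(counts[s] for s in BLOCKING) else "CONVERGED"
    return {
        "new_critical": counts["CRITICAL"],
        "new_high": counts["HIGH"],
        "decision": decision,
    }


# Severities of the attacks raised in each round of Example 1.
rounds = {
    1: ["CRITICAL", "HIGH", "CRITICAL", "HIGH"],
    2: ["HIGH", "MEDIUM", "MEDIUM"],
    3: ["MEDIUM", "MEDIUM"],
}

for number, severities in rounds.items():
    print(number, evaluate_round(severities))
```

Rounds 1 and 2 yield CONTINUE and round 3 yields CONVERGED, matching the decisions recorded in the example. A real run would also stop at `max_rounds`, which this sketch leaves out.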

Example 2: Strategy Validation — Market Expansion

yaml

input:
  subject: "Expand to European market in Q3 with existing product"
  subject_type: strategy
  max_rounds: 2
  attack_intensity: standard
  convergence_mode: no_new_critical

flow:
  round_1:
    red_team:
      attacks:
        - ATK-1-1: "GDPR compliance timeline is unrealistic"
          Severity: CRITICAL
        - ATK-1-2: "Competitor X already dominates EU market"
          Severity: HIGH
        - ATK-1-3: "Pricing model assumes US willingness-to-pay"
          Severity: HIGH
        - ATK-1-4: "No local sales team or market knowledge"
          Severity: MEDIUM

    blue_team:
      defenses:
        - DEF-1-1: HARDEN - Push launch to Q4; engage GDPR consultancy
        - DEF-1-2: MITIGATE - Focus on underserved segments competitor ignores
        - DEF-1-3: HARDEN - Conduct pricing research; plan EU-specific pricing
        - DEF-1-4: MITIGATE - Partner with EU distributor initially

    evaluation:
      new_critical: 1
      new_high: 2
      decision: CONTINUE

  round_2:
    red_team:
      attacks:
        - ATK-2-1: "EU distributor takes 40% margin"
          Severity: MEDIUM
        - ATK-2-2: "Underserved segments may be underserved for good reason"
          Severity: MEDIUM

    blue_team:
      defenses:
        - DEF-2-1: ACCEPT - Cost of market entry; build direct sales in Year 2
        - DEF-2-2: MITIGATE - Validate segment with EU market research

    evaluation:
      new_critical: 0
      new_high: 0
      decision: CONVERGED

  post_round:
    hardened_proposition: |
      Launch EU expansion in Q4 (not Q3) with:
      - GDPR compliance verified by consultancy
      - EU-specific pricing based on market research
      - Initial distribution through EU partner
      - Focus on [specific underserved segments]
      - Year 2: Build direct sales capability

    risk_assessment:
      profile: MODERATE
      go_no_go: PROCEED_WITH_CAUTION
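
In the compact form used in Example 2, each defense `DEF-r-n` answers the attack `ATK-r-n` from the same round. The Python sketch below checks a log for attacks that were left without a defense. It assumes that ID convention holds for the whole log; the function name `unanswered_attacks` is hypothetical and not part of the skill.

```python
import re

# IDs follow the ATK-<round>-<index> / DEF-<round>-<index> shape used in the examples.
ID_PATTERN = re.compile(r"^(ATK|DEF)-(\d+)-(\d+)$")


def unanswered_attacks(attack_ids, defense_ids):
    """Return attack IDs that have no defense with the same round and index."""

    def key(identifier, expected_prefix):
        match = ID_PATTERN.match(identifier)
        if match is None or match.group(1) != expected_prefix:
            raise ValueError(f"unexpected ID: {identifier}")
        return int(match.group(2)), int(match.group(3))

    answered = {key(d, "DEF") for d in defense_ids}
    return [a for a in attack_ids if key(a, "ATK") not in answered]


# Attack and defense IDs from Example 2.
attacks = ["ATK-1-1", "ATK-1-2", "ATK-1-3", "ATK-1-4", "ATK-2-1", "ATK-2-2"]
defenses = ["DEF-1-1", "DEF-1-2", "DEF-1-3", "DEF-1-4", "DEF-2-1", "DEF-2-2"]

print(unanswered_attacks(attacks, defenses))       # []
print(unanswered_attacks(attacks, defenses[:-1]))  # ['ATK-2-2']
```

A check like this could run before the evaluation phase, so that a round is never scored while an attack is still open.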

Example 3: Investment Decision — Series B Funding

yaml

input:
  subject: "Accept Series B term sheet at $50M valuation"
  subject_type: investment
  max_rounds: 2
  attack_intensity: standard
  convergence_mode: no_new_critical

flow:
  pre_round:
    proposition: "Accept $15M Series B at $50M pre-money valuation from [VC Firm]"
    attack_surface:
      - ECONOMIC: Valuation, dilution, runway
      - ASSUMPTIONS: Growth projections, market size
      - DEPENDENCIES: VC firm reputation, board dynamics
      - COMPETITIVE: Market timing, competitor funding

  round_1:
    red_team:
      attacks:
        - ATK-1-1: "Valuation assumes 3x YoY growth; current trajectory is 1.8x"
          Category: ASSUMPTIONS
          Severity: HIGH
          Steel-manned: "At 1.8x growth, next round valuation math doesn't work;
                        down round likely in 18 months"

        - ATK-1-2: "15-month runway at current burn; need to hit milestones or raise bridge"
          Category: ECONOMIC
          Severity: HIGH
          Steel-manned: "Milestones require growth acceleration you haven't demonstrated"

        - ATK-1-3: "[VC Firm] has reputation for replacing founders at Series C"
          Category: DEPENDENCIES
          Severity: MEDIUM
          Steel-manned: "3 of their last 5 Series B companies had founder transitions"

        - ATK-1-4: "Competitor just raised $40M; will outspend on customer acquisition"
          Category: COMPETITIVE
          Severity: HIGH
          Steel-manned: "Their CAC advantage compounds; market share gap widens"

    blue_team:
      defenses:
        - DEF-1-1: Response to ATK-1-1
          Type: HARDEN
          Defense: "Negotiate milestone-based valuation adjustment; lower initial
                   valuation with ratchet up if growth targets hit"
          Proposition Change: Add milestone ratchet provision

        - DEF-1-2: Response to ATK-1-2
          Type: MITIGATE
          Defense: "Negotiate 18-month runway minimum; reduce burn by 20% through
                   hiring pause, extending the 15-month runway to roughly 19 months"
          Residual: REDUCED

        - DEF-1-3: Response to ATK-1-3
          Type: MITIGATE
          Defense: "Negotiate founder-friendly protective provisions; 2-year
                   employment agreements; board composition safeguards"
          Residual: REDUCED

        - DEF-1-4: Response to ATK-1-4
          Type: ACCEPT
          Defense: "Competitive pressure is real but unavoidable. Focus on
                   capital-efficient growth and product differentiation over
                   a CAC war. This is a market reality, not a term sheet issue."
          Residual: UNCHANGED

    evaluation:
      new_critical: 0
      new_high: 3
      decision: CONTINUE

  round_2:
    red_team:
      attacks:
        - ATK-2-1: "Milestone ratchet creates misaligned incentives; may optimize
                   for metrics over business health"
          Category: ASSUMPTIONS
          Severity: MEDIUM

        - ATK-2-2: "Hiring pause delays product roadmap; competitive gap widens"
          Category: TEMPORAL
          Severity: MEDIUM

    blue_team:
      defenses:
        - DEF-2-1: Response to ATK-2-1
          Type: MITIGATE
          Defense: "Structure milestones around leading indicators (retention,
                   NPS) not just growth metrics"
          Residual: REDUCED

        - DEF-2-2: Response to ATK-2-2
          Type: ACCEPT
          Defense: "Trade-off accepted; survival > speed. Revisit hiring
                   after 6-month runway checkpoint."
          Residual: UNCHANGED

    evaluation:
      new_critical: 0
      new_high: 0
      decision: CONVERGED

  post_round:
    hardened_proposition: |
      Accept Series B with modifications:
      - Milestone-based valuation: $45M base + $10M ratchet if 2.5x growth
      - 18-month minimum runway commitment
      - Founder protective provisions (2-year agreements, board balance)
      - Hiring pause for 6 months; revisit at runway checkpoint
      - Milestones tied to retention/NPS, not just growth

    risk_assessment:
      total_risks: 4
      risk_profile: MODERATE
      go_no_go: PROCEED_WITH_CAUTION

      recommendation: |
        PROCEED_WITH_CAUTION — Accept modified term sheet. Key risks:
        1. Competitive pressure (accepted as market reality)
        2. Growth trajectory uncertainty (mitigated by ratchet)
        3. Founder/board dynamics (mitigated by provisions)

        Negotiate the hardened terms before signing. Walk away if the
        milestone ratchet or protective provisions are rejected.
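
The valuation change in Example 3 is easier to weigh as dilution. The Python sketch below computes the share of the company sold for a $15M investment at each pre-money figure that appears in the example, using post-money = pre-money + investment. It assumes the investment stays at $15M and treats the ratchet as a simple step from $45M to $55M pre-money; real ratchet mechanics, option pools, and other terms are ignored, and `ownership_sold` is a hypothetical helper.

```python
def ownership_sold(pre_money_musd, investment_musd):
    """Fraction of the company sold: investment divided by post-money valuation."""
    post_money = pre_money_musd + investment_musd
    return investment_musd / post_money


INVESTMENT = 15  # $15M round, from the example proposition

# Pre-money valuations in $M; the ratchet case assumes the full $10M step applies.
scenarios = {
    "original term sheet": 50,
    "hardened base": 45,
    "hardened with ratchet": 55,
}

for label, pre_money in scenarios.items():
    print(f"{label}: {ownership_sold(pre_money, INVESTMENT):.1%}")
```

Under these assumptions the founders sell 23.1% at the original terms, 25.0% at the hardened base, and 21.4% if the ratchet triggers, which is the trade DEF-1-1 makes: more dilution now in exchange for less if growth targets are hit.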

14. Quick Start

Minimal Invocation

Red team this: [paste proposition]

Standard Invocation

subject_type: decision
attack_intensity: standard
convergence_mode: no_new_critical

Proposition: [description or document]

Full Parameter Invocation

subject_type: architecture
max_rounds: 4
attack_intensity: aggressive
attack_categories: [ASSUMPTIONS, SCALABILITY, SECURITY, OPERATIONAL]
convergence_mode: all_addressed
include_experience_pool: true
steel_manning_level: maximum
output_mode: full

Proposition: [detailed description]

Context:
- Stakes: [why this matters]
- Constraints: [limitations]
- Stakeholders: [who cares]
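
When invocations are generated by a script, the parameter block can be assembled and sanity-checked before it is sent. The Python sketch below renders the `key: value` layout shown above. The `Invocation` class, its defaults, and its `render` method are hypothetical; only `attack_intensity` is validated, against the three levels in the Tone Calibration table, because the full lists of allowed values are defined in the Parameters section.

```python
from dataclasses import dataclass, field

# Intensity levels as listed in the Tone Calibration table.
INTENSITIES = {"light", "standard", "aggressive"}


@dataclass
class Invocation:
    """Illustrative container for invocation parameters (defaults are assumptions)."""

    proposition: str
    subject_type: str = "decision"
    attack_intensity: str = "standard"
    convergence_mode: str = "no_new_critical"
    max_rounds: int = 3
    attack_categories: list = field(default_factory=list)

    def render(self):
        """Return the invocation as text in the key: value layout used above."""
        if self.attack_intensity not in INTENSITIES:
            raise ValueError(f"unknown attack_intensity: {self.attack_intensity}")
        if self.max_rounds < 1:
            raise ValueError("max_rounds must be at least 1")
        lines = [
            f"subject_type: {self.subject_type}",
            f"max_rounds: {self.max_rounds}",
            f"attack_intensity: {self.attack_intensity}",
        ]
        if self.attack_categories:
            lines.append(f"attack_categories: [{', '.join(self.attack_categories)}]")
        lines.append(f"convergence_mode: {self.convergence_mode}")
        lines.append("")
        lines.append(f"Proposition: {self.proposition}")
        return "\n".join(lines)


print(Invocation(
    proposition="Migrate payment processing from monolith to microservices",
    subject_type="architecture",
    max_rounds=4,
    attack_intensity="aggressive",
    attack_categories=["ASSUMPTIONS", "SCALABILITY", "SECURITY", "OPERATIONAL"],
    convergence_mode="all_addressed",
).render())
```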
