Agent skill

brainstorming

Collaborative ideation and planning with resilient multi-model exploration, consensus scoring, and adaptive confidence-based validation

Install this agent skill to your Project

npx add-skill https://github.com/MadAppGang/claude-code/tree/main/plugins/dev/skills/planning/brainstorming

SKILL.md

Brainstorming v2.0: Resilient Multi-Model Planning

Turn ideas into validated designs through collaborative AI dialogue with resilient model execution and confidence-based validation.

Overview

This skill improves upon v1.0 by addressing critical reliability gaps:

Key v2.0 Improvements:

No AskUserQuestion dependency: Uses Task + Tasks for structured interaction
Fallback chains: 3+ models per role, so a run can still complete when some models fail
Explicit parallelism: Documented Task call patterns for parallel execution
Defined algorithms: Consensus matrix and confidence scoring are mathematically specified

When to Use

Use this skill BEFORE implementing any feature:

"Design a user authentication system"
"Brainstorm approaches for API rate limiting"
"Plan architecture for a new dashboard feature"
"Evaluate options for real-time data synchronization"

Prerequisites

Required Setup

bash

# 1. Install required skills
/plugin marketplace add MadAppGang/claude-code
skill install superpowers:using-git-worktrees
skill install superpowers:writing-plans

# 2. Verify OpenRouter access (for multi-model)
export OPENROUTER_API_KEY=your-key

# 3. Configure models in ~/.claude/settings.json (the content below is JSON for that file, not a shell command)
{
  "brainstorming": {
    "primary_model": "anthropic/claude-opus-4-20250514",
    "explorer_models": [
      "x-ai/grok-code-fast-1",
      "google/gemini-2-5-pro",
      "anthropic/claude-sonnet-4-20250514"
    ]
  }
}

Model Requirements

Role	Min Context	Capabilities
Primary	200K tokens	Complex reasoning, orchestration
Explorer	100K tokens	Code generation, analysis

Workflow

Phase 0: Problem Analysis (200-300 words)

Objective: Capture problem scope, constraints, and success criteria

How to Ask Users (Without AskUserQuestion):

typescript

// Pattern: Use Tasks to track questions, Read/Write for presentation

// 1. Write question to temp file
await Write({
  file_path: "/tmp/brainstorm-q1.md",
  content: `## Question 1 of 3

**What are the main constraints or requirements for this feature?**

Please respond with:
- Functional requirements (what it must do)
- Non-functional requirements (performance, scale)
- Any existing dependencies or integrations
`
});

// 2. Present file and wait for user response
// User reads file, provides input via conversation

// 3. Summarize understanding
const problemSummary = await Write({
  file_path: "/tmp/brainstorm-problem.md",
  content: `## Problem Understanding

**Constraints identified:**
- [From user response]

**Success criteria:**
- [Measurable outcomes]

**Scope boundaries:**
- [What's in/out]

---

**Does this accurately capture the problem?** (Reply "yes" to proceed or clarify)
`
});

Gate Type: USER_GATE (requires confirmation)

Phase 1: Parallel Exploration

Objective: Generate diverse solutions via multi-model brainstorming

Fallback Chain Implementation:

typescript

interface ModelResult {
  model: string;
  success: boolean;
  output?: string;
  error?: string;
}

async function exploreWithFallback(
  prompt: string,
  role: "explorer" | "primary"  // widened so the non-explorer branch below is reachable
): Promise<ModelResult> {
  const fallbackModels = role === "explorer"
    ? ["x-ai/grok-code-fast-1", "google/gemini-2-5-pro", "deepseek/deepseek-coder"]
    : ["anthropic/claude-opus-4-20250514", "anthropic/claude-sonnet-4-20250514"];

  for (const model of fallbackModels) {
    try {
      const result = await Task({
        model: model,
        prompt: prompt,
        timeout_ms: 120000  // 2 minute timeout
      });

      return { model, success: true, output: result };
    } catch (error) {
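      // `error` is `unknown` in strict TypeScript, so narrow before reading `.message`
      console.warn(`Model ${model} failed:`, error instanceof Error ? error.message : error);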
      continue;  // Try next in chain
    }
  }

  throw new Error(`All models in fallback chain failed`);
}
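
The fallback loop above can be exercised without real model calls by injecting the function that runs a model. The following is a minimal sketch under that assumption: `runWithFallback`, `ModelRunner`, and `FallbackResult` are hypothetical names that are not part of the skill, and the runner stands in for the `Task` call.

```typescript
// Hypothetical stand-in for the Task call: takes a model ID and a prompt.
type ModelRunner = (model: string, prompt: string) => Promise<string>;

interface FallbackResult {
  model: string;     // model that finally answered
  output: string;
  failed: string[];  // models that were tried and failed before it
}

// Try each model in order and return the first successful answer.
async function runWithFallback(
  models: string[],
  prompt: string,
  run: ModelRunner
): Promise<FallbackResult> {
  const failed: string[] = [];
  for (const model of models) {
    try {
      const output = await run(model, prompt);
      return { model, output, failed };
    } catch {
      failed.push(model); // remember the failure, then try the next model
    }
  }
  throw new Error(`All models in fallback chain failed: ${failed.join(", ")}`);
}
```

Returning the list of failed models alongside the answer makes it possible to report which part of the chain was skipped, which the original loop only logs.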

Parallel Execution Pattern:

typescript

// WRONG: Sequential (slow)
// const result1 = await Task({ model: "grok", ... });
// const result2 = await Task({ model: "gemini", ... });
// const result3 = await Task({ model: "sonnet", ... });

// CORRECT: Parallel (wall time is roughly that of the slowest call)
// Promise.allSettled, unlike Promise.all, does not reject when one call fails,
// so results from the remaining models are preserved.
const settled = await Promise.allSettled([
  Task({
    model: "x-ai/grok-code-fast-1",
    prompt: generateExplorerPrompt(problem, "fast_code")
  }),
  Task({
    model: "google/gemini-2-5-pro",
    prompt: generateExplorerPrompt(problem, "balanced")
  }),
  Task({
    model: "anthropic/claude-sonnet-4-20250514",
    prompt: generateExplorerPrompt(problem, "thorough")
  })
]);

// Handle partial failures: keep only the calls that fulfilled
const results = settled
  .filter(s => s.status === "fulfilled")
  .map(s => (s as PromiseFulfilledResult<string>).value);
if (results.length === 0) {
  throw new Error("All exploration models failed");
}
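
`Promise.all` rejects as soon as any input promise rejects, so per-call failures have to be caught before they reach it if partial results are to survive. The sketch below shows one way to do that with plain promises; `Outcome`, `toOutcome`, and `collectSuccesses` are hypothetical helper names, not part of the skill.

```typescript
// Tagged outcome so one rejected call cannot reject the whole batch.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; reason: string };

// Convert a promise that may reject into one that always resolves to an Outcome.
function toOutcome<T>(p: Promise<T>): Promise<Outcome<T>> {
  return p.then(
    (value): Outcome<T> => ({ ok: true, value }),
    (err: unknown): Outcome<T> => ({
      ok: false,
      reason: err instanceof Error ? err.message : String(err),
    })
  );
}

// Run all calls concurrently and return only the values that succeeded,
// preserving the input order.
async function collectSuccesses<T>(calls: Promise<T>[]): Promise<T[]> {
  const outcomes = await Promise.all(calls.map(c => toOutcome(c)));
  const values: T[] = [];
  for (const o of outcomes) {
    if (o.ok) values.push(o.value);
  }
  return values;
}
```

An empty array from `collectSuccesses` corresponds to the "all exploration models failed" case.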

Output Format:

markdown

## Approach: [Name]

**Model**: [Which model generated this]
**Approach Type**: [architecture/algorithm/pattern]
**Summary**: 2-3 sentences

**Key Components**:
1. Component A
2. Component B
3. Component C

**Trade-offs**:
- + Advantage
- - Disadvantage

**Confidence**: [Model's confidence 0-100]

Gate Type: AUTO_GATE (automatic consolidation)

Phase 2: Consensus Analysis

Objective: Identify strongest ideas using defined algorithms

Consensus Matrix Algorithm:

Clustering: Group approaches by semantic similarity (vector embedding + clustering)
Scoring: Count model agreement per cluster
Classification: UNANIMOUS (3/3), STRONG (2/3), DIVERGENT (1/3)
Confidence: Average of the model confidences in the cluster, plus an agreement bonus (the formula further down also subtracts a diversity penalty)

Consensus Matrix Calculation:

typescript

interface Approach {
  id: string;
  name: string;
  summary: string;
  model: string;  // Which model proposed
  modelConfidence: number;  // 0-100
  embedding: number[];  // For clustering
}

interface Cluster {
  approaches: Approach[];
  representative: Approach;  // Most complete
  agreementScore: number;  // 0-1
  confidenceScore: number;  // 0-100
  consensusLevel: "UNANIMOUS" | "STRONG" | "DIVERGENT";
}

function calculateConsensus(approaches: Approach[]): Cluster[] {
  // Step 1: Cluster by semantic similarity
  // 0.85 is the similarity threshold; clusterByEmbedding is a helper that is not defined in this document
  const clusters = clusterByEmbedding(approaches, 0.85);

  // Step 2: Calculate metrics per cluster
  return clusters.map(cluster => {
    const models = cluster.map(a => a.model);
    const modelCount = new Set(models).size;
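    // Count distinct models, not approaches: each model may propose several approaches
    const totalModels = new Set(approaches.map(a => a.model)).size;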

    // Agreement: proportion of models that have an approach in this cluster
    const agreementScore = modelCount / totalModels;

    // Confidence: plain average of model confidences + agreement bonus.
    // Note: the DiversityPenalty term from the confidence formula is not applied here.
    const baseConfidence = cluster
      .map(a => a.modelConfidence)
      .reduce((a, b) => a + b, 0) / cluster.length;

    const confidenceScore = Math.min(100,
      baseConfidence + (agreementScore * 20)  // up to +20 points for agreement
    );

    // Consensus classification
    const consensusLevel = agreementScore >= 0.9 ? "UNANIMOUS" :
                          agreementScore >= 0.5 ? "STRONG" :
                          "DIVERGENT";

    return {
      approaches: cluster,
      representative: cluster.reduce((best, current) =>
        current.modelConfidence > best.modelConfidence ? current : best
      ),
      agreementScore,
      confidenceScore: Math.round(confidenceScore),
      consensusLevel
    };
  }).sort((a, b) => b.confidenceScore - a.confidenceScore);
}
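
`calculateConsensus` relies on a `clusterByEmbedding` helper that the document does not define. The following is a minimal sketch of what such a helper might look like, assuming cosine similarity and a greedy single-pass grouping; this is an illustrative choice, not the skill's confirmed clustering method, and the `Embedded` interface is a hypothetical reduction of `Approach` to the one field the helper needs.

```typescript
// Minimal shape needed for clustering; the document's Approach has more fields.
interface Embedded {
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy single-pass clustering (an assumption, not the skill's actual method):
// each item joins the first cluster whose first member is similar enough,
// otherwise it starts a new cluster.
function clusterByEmbedding<T extends Embedded>(items: T[], threshold: number): T[][] {
  const clusters: T[][] = [];
  for (const item of items) {
    const home = clusters.find(
      c => cosineSimilarity(c[0].embedding, item.embedding) >= threshold
    );
    if (home) {
      home.push(item);
    } else {
      clusters.push([item]);
    }
  }
  return clusters;
}
```

Greedy grouping depends on input order, so a production version would more likely use a proper clustering algorithm; the sketch is only meant to make the threshold parameter concrete.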

Confidence Scoring Formula:

Confidence = Base + AgreementBonus - DiversityPenalty

Where:
  Base = average(model confidences in cluster)
  AgreementBonus = (unique_models / total_models) * 20
  DiversityPenalty = (1 - similarity_coefficient) * 10

Example:
  3 models propose similar approaches
  Base = (92 + 88 + 95) / 3 = 91.7
  AgreementBonus = (3/3) * 20 = 20
  DiversityPenalty = (1 - 0.9) * 10 = 1
  Confidence = 91.7 + 20 - 1 = 110.7 -> capped at 100
  Final: 100%
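
The formula can be written as a small function to check the arithmetic. This is a sketch under stated assumptions: `clusterConfidence` and `ConfidenceInputs` are hypothetical names, the result is rounded to an integer, the lower clamp at 0 is an addition not stated in the formula, and `modelConfidences` is assumed to be non-empty.

```typescript
interface ConfidenceInputs {
  modelConfidences: number[];     // each 0-100, assumed non-empty
  uniqueModels: number;           // models with an approach in this cluster
  totalModels: number;            // models that took part overall
  similarityCoefficient: number;  // 0-1, how similar the clustered approaches are
}

// Confidence = Base + AgreementBonus - DiversityPenalty, clamped to 0-100.
function clusterConfidence(c: ConfidenceInputs): number {
  const base =
    c.modelConfidences.reduce((sum, x) => sum + x, 0) / c.modelConfidences.length;
  const agreementBonus = (c.uniqueModels / c.totalModels) * 20;
  const diversityPenalty = (1 - c.similarityCoefficient) * 10;
  const raw = base + agreementBonus - diversityPenalty;
  // The upper cap comes from the formula; the lower clamp is an assumption.
  return Math.round(Math.min(100, Math.max(0, raw)));
}

// Inputs from the example above: 91.7 + 20 - 1 = 110.7, capped at 100.
const capped = clusterConfidence({
  modelConfidences: [92, 88, 95],
  uniqueModels: 3,
  totalModels: 3,
  similarityCoefficient: 0.9,
});

// Hypothetical lower inputs: 78 + 20 - 1 = 97, below the cap.
const uncapped = clusterConfidence({
  modelConfidences: [80, 75, 79],
  uniqueModels: 3,
  totalModels: 3,
  similarityCoefficient: 0.9,
});
```

Because the agreement bonus alone adds 20 points for a unanimous cluster, any unanimous cluster with a base above roughly 81 hits the cap.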

Consensus Matrix Example:

Approach	Grok	Gemini	Sonnet	Agreement	Confidence
Token Bucket	Yes	Yes	Yes	UNANIMOUS	97%
Leaky Bucket	Yes	Yes	No	STRONG	82%
Sliding Window	No	No	Yes	DIVERGENT	45%
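
The Agreement column follows from the per-model votes. A minimal sketch, reusing the 0.9 and 0.5 thresholds from `calculateConsensus`; `classifyVotes` is a hypothetical helper name.

```typescript
type ConsensusLevel = "UNANIMOUS" | "STRONG" | "DIVERGENT";

// Map one row of yes/no votes (one entry per model) to a consensus level.
function classifyVotes(votes: boolean[]): ConsensusLevel {
  const agreement = votes.filter(Boolean).length / votes.length;
  if (agreement >= 0.9) return "UNANIMOUS";
  if (agreement >= 0.5) return "STRONG";
  return "DIVERGENT";
}

const tokenBucket = classifyVotes([true, true, true]);      // Grok, Gemini, Sonnet
const leakyBucket = classifyVotes([true, true, false]);
const slidingWindow = classifyVotes([false, false, true]);
```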

Gate Type: AUTO_GATE (automatic scoring)

Phase 3: User Selection

Objective: Present top approaches for user decision

Presentation Pattern:

typescript

async function presentApproaches(clusters: Cluster[]): Promise<string> {
  const topClusters = clusters.slice(0, 3);  // Top 3, so labels A-C match the choices below (D and E are reserved)

  let presentation = `## Top Approaches\n\n`;

  for (const [index, cluster] of topClusters.entries()) {
    const approach = cluster.representative;

    presentation += `### ${String.fromCharCode(65 + index)}: ${approach.name} [${cluster.consensusLevel}]

**Summary**: ${approach.summary}

**Confidence**: ${cluster.confidenceScore}% (${new Set(cluster.approaches.map(a => a.model)).size} model(s) agree)

**Pros**:
${cluster.approaches.map(a => `- ${a.summary}`).join("\n")}

**Cons**:
${cluster.approaches.map(a => `- Potential issue from ${a.model}`).join("\n")}

---
`;
  }

  presentation += `
## Your Choice

Which approach best fits your requirements?

- **A**: Select approach A
- **B**: Select approach B
- **C**: Select approach C
- **D**: Combine elements from multiple
- **E**: Explore alternatives (return to Phase 1)
`;

  // Save for user review
  await Write({
    file_path: "/tmp/brainstorm-approaches.md",
    content: presentation
  });

  return presentation;
}

Gate Type: USER_GATE (selection via conversation)

Phase 4: Detailed Planning

Objective: Elaborate selected approach into actionable sections

Confidence-Based Gating:

Confidence	Gate Type	Action
>=95%	AUTO_GATE	Proceed automatically
80-94%	AUTO_GATE	Proceed with notification
60-79%	USER_GATE	Request confirmation
<60%	USER_GATE	Require revision

Section Template:

markdown

## [Section Name] (Confidence: XX%)

**Approach**: [Selected approach]

**Implementation Details**:
[200-300 words]

**Assumptions**:
- Assumption 1
- Assumption 2

**Confidence Calculation**:
- Technical feasibility: XX%
- Edge cases covered: XX%
- Team capability: XX%
- Overall: XX%

**Status**: [AUTO_GATE|PENDING_USER] - [Reason]

Gate Type: MIXED_GATE (adaptive)
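
The confidence-based gating table in this phase can be read as a threshold function. The sketch below is one possible encoding; `gateFor` and `GateDecision` are hypothetical names, and treating fractional scores such as 94.5 as part of the 80-94% band is an assumption, since the table only lists whole percentages.

```typescript
interface GateDecision {
  gate: "AUTO_GATE" | "USER_GATE";
  action: string;
}

// Map a section's confidence (0-100) to the gate and action from the table.
function gateFor(confidence: number): GateDecision {
  if (confidence >= 95) return { gate: "AUTO_GATE", action: "Proceed automatically" };
  if (confidence >= 80) return { gate: "AUTO_GATE", action: "Proceed with notification" };
  if (confidence >= 60) return { gate: "USER_GATE", action: "Request confirmation" };
  return { gate: "USER_GATE", action: "Require revision" };
}
```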

Phase 5: Plan Validation

Objective: Final review before implementation

Validation Checklist:

markdown

## Plan Validation

**Problem**: [Summary]
**Approach**: [Selected]
**Confidence**: [Overall]

### Checklist

- [ ] Problem scope accurately captured
- [ ] Chosen approach matches expectations
- [ ] Module structure aligns with capabilities
- [ ] Technical constraints addressed
- [ ] Success criteria measurable

### Next Steps

**To proceed**:
1. Reply "approve" to finalize
2. Reply "revise [section]" to modify
3. Reply "restart" to begin fresh

**Final decision?**

Gate Type: USER_GATE (explicit approval)
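
The three replies listed under Next Steps can be parsed into a typed decision. This is a minimal sketch, assuming case-insensitive matching and an extra "unrecognized" outcome for anything else; `parsePlanDecision` and `PlanDecision` are hypothetical names.

```typescript
type PlanDecision =
  | { kind: "approve" }
  | { kind: "revise"; section: string }
  | { kind: "restart" }
  | { kind: "unrecognized"; raw: string };

// Interpret the user's final reply from the validation checklist.
function parsePlanDecision(reply: string): PlanDecision {
  const text = reply.trim();
  const lower = text.toLowerCase();
  if (lower === "approve") return { kind: "approve" };
  if (lower === "restart") return { kind: "restart" };
  // "revise <section>": everything after the keyword names the section.
  const match = /^revise\s+(.+)$/i.exec(text);
  if (match) return { kind: "revise", section: match[1].trim() };
  return { kind: "unrecognized", raw: text };
}
```

An "unrecognized" result is a natural point to ask the question again rather than guess the user's intent.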

Complete Parallel Execution Example

typescript

// Complete Phase 1 parallel exploration
async function runParallelExploration(problem: string): Promise<Approach[]> {
  const explorerModels = [
    "x-ai/grok-code-fast-1",     // Fast, code-focused
    "google/gemini-2-5-pro",     // Balanced, creative
    "anthropic/claude-sonnet-4-20250514"  // Thorough
  ];

  const prompts = explorerModels.map(model =>
    `Generate 5 implementation approaches for: ${problem}

For each approach provide:
1. Name (2-3 words)
2. One-sentence summary
3. Key components (bullet points)
4. Trade-offs (+/-)
5. Your confidence (0-100)

Format as JSON array.`
  );

  // LAUNCH ALL MODELS IN PARALLEL
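  const taskPromises = explorerModels.map((model, index) =>
    Task({
      model: model,
      prompt: prompts[index],
      timeout_ms: 120000,
      max_turns: 1
    })
      // Wrap successes in the same shape as failures; without this,
      // `r.success` is undefined for successful calls and the filter drops them.
      .then(output => ({ model, success: true, output }))
      .catch(error => ({
        model,
        success: false,
        output: undefined,
        error: error.message
      }))
  );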

  // WAIT FOR ALL TO COMPLETE
  const results = await Promise.all(taskPromises);

  // CONSOLIDATE SUCCESSFUL RESULTS
  const approaches: Approach[] = results
    .filter(r => r.success)
    .flatMap(r => parseApproaches(r.output));

  // HANDLE PARTIAL FAILURES
  if (approaches.length < 5) {
    console.warn(`Only got ${approaches.length} approaches from ${explorerModels.length} models`);
    if (approaches.length === 0) {
      throw new Error("All models failed");
    }
  }

  return approaches;
}
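
`runParallelExploration` calls a `parseApproaches` helper that the document does not define. The sketch below shows one way such parsing might work for the JSON array the prompt requests. Everything here is an assumption: the name `parseApproachList`, the extra `model` parameter, the `ParsedApproach` shape, and the JSON keys `name`, `summary`, and `confidence`, which the prompt does not pin down.

```typescript
interface ParsedApproach {
  name: string;
  summary: string;
  model: string;
  modelConfidence: number; // clamped to 0-100
}

// Parse a model reply that should contain a JSON array of approaches.
// Returns an empty list instead of throwing when the reply is unusable.
function parseApproachList(raw: string, model: string): ParsedApproach[] {
  // In case the reply wraps the JSON in other text, take the outermost [...] span.
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) return [];

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return []; // malformed JSON: treat as no usable approaches
  }
  if (!Array.isArray(data)) return [];

  const parsed: ParsedApproach[] = [];
  for (const item of data) {
    if (typeof item !== "object" || item === null) continue;
    const rec = item as Record<string, unknown>;
    const approachName = rec["name"];
    const summary = rec["summary"];
    const confidence = rec["confidence"];
    // Skip entries that lack the assumed fields rather than failing the batch.
    if (typeof approachName !== "string" || typeof summary !== "string") continue;
    if (typeof confidence !== "number" || !Number.isFinite(confidence)) continue;
    parsed.push({
      name: approachName,
      summary,
      model,
      modelConfidence: Math.min(100, Math.max(0, confidence)),
    });
  }
  return parsed;
}
```

Skipping invalid entries instead of throwing keeps one badly formatted reply from discarding the other models' results.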

Troubleshooting

Model Failures

Symptom	Cause	Solution
Single model fails	API error, timeout	Fallback chain handles automatically
All models fail	API key issue, network	Check `OPENROUTER_API_KEY`, retry
Partial results (2/3)	One model unavailable	Continue with available; lower diversity but valid

Recovery Pattern:

typescript

async function resilientExploration(problem: string): Promise<Approach[]> {
  let attempts = 0;
  const maxAttempts = 3;

  while (attempts < maxAttempts) {
    try {
      return await runParallelExploration(problem);
    } catch (error) {
      attempts++;
      if (attempts === maxAttempts) throw error;

      // Exponential backoff
      await new Promise(r => setTimeout(r, Math.pow(2, attempts) * 1000));
    }
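  }

  // Not reached in practice: the loop either returns or rethrows on the last
  // attempt. The explicit throw satisfies TypeScript's return-type check.
  throw new Error("Exploration failed after all retry attempts");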
}
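
The recovery pattern generalizes to any async operation. The following sketch makes the delay function injectable so the backoff schedule can be observed without waiting; `retryWithBackoff` and its `sleep` parameter are hypothetical, and the delays follow the same `2^attempt * 1000` rule as above.

```typescript
// Retry an async operation with exponential backoff between attempts.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  // Injectable for tests; defaults to a real timer.
  sleep: (ms: number) => Promise<void> =
    ms => new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await sleep(Math.pow(2, attempt) * 1000); // 2s, 4s, 8s, ...
      }
    }
  }
  throw lastError; // every attempt failed: surface the last error
}
```

With `maxAttempts` set to 3, an operation that fails twice waits 2000 ms and then 4000 ms before its third and final attempt.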

Consensus Issues

Symptom	Cause	Solution
All approaches DIVERGENT	Models produce very different ideas	Not a failure - indicates novel problem
Single cluster with 90%+ confidence	Problem is well-understood	Good for AUTO_GATE
No clear winner	Multiple valid approaches	Present all to user

User Interaction Issues

Symptom	Cause	Solution
User doesn't respond	Unclear question	Rewrite with specific format
User provides conflicting answers	Multiple questions at once	Ask one at a time, confirm understanding
User wants to restart	Dissatisfied with direction	Allow restart to Phase 0

Configuration

Environment Variables

bash

# Required for multi-model
OPENROUTER_API_KEY=...

# Optional
BRAINSTORM_TIMEOUT_MS=120000
BRAINSTORM_MAX_RETRIES=2
BRAINSTORM_MIN_MODELS=2  # Minimum models for valid consensus
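
If these variables are read from code, they might be parsed along the following lines. This is a sketch only: `loadBrainstormEnv`, `readNonNegativeInt`, and `BrainstormEnv` are hypothetical names, and treating the values shown above as defaults when a variable is unset or invalid is an assumption. The environment is passed in as a plain object (for example `process.env` in Node.js) so the function has no runtime dependency.

```typescript
interface BrainstormEnv {
  apiKey: string;
  timeoutMs: number;
  maxRetries: number;
  minModels: number;
}

// Parse a non-negative integer, falling back when the value is missing or invalid.
function readNonNegativeInt(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number(value);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

function loadBrainstormEnv(env: Record<string, string | undefined>): BrainstormEnv {
  const apiKey = env["OPENROUTER_API_KEY"];
  if (!apiKey) {
    throw new Error("OPENROUTER_API_KEY is required for multi-model runs");
  }
  return {
    apiKey,
    // Fallback values mirror the examples listed above (assumed to be defaults).
    timeoutMs: readNonNegativeInt(env["BRAINSTORM_TIMEOUT_MS"], 120000),
    maxRetries: readNonNegativeInt(env["BRAINSTORM_MAX_RETRIES"], 2),
    minModels: readNonNegativeInt(env["BRAINSTORM_MIN_MODELS"], 2),
  };
}
```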

Model Configuration

json

{
  "brainstorming": {
    "primary": ["anthropic/claude-opus-4-20250514"],
    "explorers": {
      "primary_chain": ["x-ai/grok-code-fast-1", "google/gemini-2-5-pro"],
      "fallback_chain": ["deepseek/deepseek-coder", "anthropic/claude-haiku-3-20250514"]
    },
    "thresholds": {
      "auto_gate": 95,
      "confirm_gate": 60
    }
  }
}
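
The document does not say how `primary_chain` and `fallback_chain` are combined. One plausible reading, sketched below as an assumption, is that explorers are tried in primary order first and fallback order second, with duplicates removed. `orderedExplorerChain` and `ExplorerConfig` are hypothetical names.

```typescript
interface ExplorerConfig {
  primary_chain: string[];
  fallback_chain: string[];
}

// Flatten the two chains into one ordered, duplicate-free list of model IDs.
function orderedExplorerChain(explorers: ExplorerConfig): string[] {
  const seen = new Set<string>();
  const chain: string[] = [];
  for (const model of [...explorers.primary_chain, ...explorers.fallback_chain]) {
    if (!seen.has(model)) {
      seen.add(model);
      chain.push(model);
    }
  }
  return chain;
}

// Using the explorer lists from the configuration above.
const chainFromConfig = orderedExplorerChain({
  primary_chain: ["x-ai/grok-code-fast-1", "google/gemini-2-5-pro"],
  fallback_chain: ["deepseek/deepseek-coder", "anthropic/claude-haiku-3-20250514"],
});
```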

Performance

Metric	v1.0	v2.0	Improvement
Model failure recovery	0%	95%	Fallback chains
Consensus calculation	Undefined	Defined	Mathematically specified
User interaction	AskUserQuestion	Task+Write	No tool dependency
Parallel execution	Implicit	Explicit	Documented patterns

Comparison: v1.0 vs v2.0

Aspect	v1.0	v2.0
AskUserQuestion	Required	Removed
Model fallbacks	None	3+ per role
Parallel pattern	Described	Code example
Consensus algorithm	Table example	Full implementation
Confidence formula	Mentioned	Math specified
Troubleshooting	4 items	10+ items
Prerequisites	None	Setup guide

Version History

v2.0.0 (2026-01-30):

Removed AskUserQuestion dependency
Added model fallback chains
Explicit parallel execution patterns
Defined consensus matrix algorithm
Added confidence scoring formula
Added prerequisites and setup guide
Expanded troubleshooting section
Winner of blind multi-model voting (3/3 votes, avg confidence 8.7/10)

v1.0.0 (2026-01-30):

Initial release
6-phase workflow
Multi-model exploration
Confidence-based gating

Status: v2.0 Ready for use
Tested: Fallback chains, parallel execution, consensus algorithm
Known Limitations: Requires OpenRouter for multi-model access

Maintainer

MadAppGang Core maintainer

Source details

Full Name: MadAppGang/claude-code
Branch: main
Path in repo: plugins/dev/skills/planning/brainstorming
License: MIT License
