Multi-Agent PR Review

This skill creates three independent sub-agents to review code changes, then aggregates their findings using consensus voting.

Overview

Fetch PR diff files and existing comments
Spawn 3 sub-agents, each receiving files in different randomized order
- Agent 1: Code Health focus (maintainability, clarity, abstractions)
- Agents 2-3: Default focus (correctness, bugs, security)
Each agent reviews and classifies issues (high/medium/low criticality)
Aggregate results: report issues where 2+ agents agree
Filter out issues already commented on (deduplication)
Post findings: summary table + inline comments for HIGH/MEDIUM issues

Workflow

Step 1: Fetch PR Diff

bash

# Get changed files from PR
gh pr diff <PR_NUMBER> --repo <OWNER/REPO> > pr_diff.patch

# Or get list of changed files
gh pr view <PR_NUMBER> --repo <OWNER/REPO> --json files -q '.files[].path'

Step 2: Run Multi-Agent Review

Execute the orchestrator script:

bash

python3 scripts/orchestrate_review.py \
  --pr-number <PR_NUMBER> \
  --repo <OWNER/REPO> \
  --diff-file pr_diff.patch

The orchestrator:

Parses the diff into individual file changes
Creates 3 shuffled orderings of the files
Spawns 3 parallel sub-agent API calls
Collects and aggregates results

Step 3: Review Prompt Templates

Sub-agents receive role-specific prompts (see references/review_prompt.md):

Agent 1 (Code Health):

Focuses on maintainability, code clarity, abstractions, debugging ease
Rates sloppy code that hurts maintainability as MEDIUM severity

Agents 2-3 (Default):

Focus on correctness, bugs, security, edge cases
Also flag significant maintainability issues as MEDIUM

Severity levels:
HIGH: Security vulnerabilities, data loss risks, crashes, broken functionality
MEDIUM: Logic errors, edge cases, performance issues, AND sloppy code that
        significantly hurts maintainability (confusing logic, poor abstractions)
LOW: Minor style issues, nitpicks, minor improvements

Output JSON array of issues.

Step 4: Consensus Aggregation & Deduplication

Issues are matched across agents by file + approximate line range + issue type. An issue is reported only if:

2+ agents identified it AND
At least one agent rated it MEDIUM or higher

Deduplication: Before posting, the script fetches existing PR comments and filters out issues that have already been commented on (matching by file, line, and issue keywords). This prevents duplicate comments when re-running the review.

Step 5: Post PR Comments

The script posts two types of comments:

Summary comment: Overview table with issue counts (always posted, even if no new issues)
Inline comments: Detailed feedback on specific lines (HIGH/MEDIUM only)

bash

python3 scripts/post_comment.py \
  --pr-number <PR_NUMBER> \
  --repo <OWNER/REPO> \
  --results consensus_results.json

Options:

--dry-run: Preview comments without posting
--summary-only: Only post summary, skip inline comments

Example Summary Comment

markdown

## :mag: Multi-Agent Code Review

Found **4** new issue(s) flagged by 3 independent reviewers.
(2 issue(s) skipped - already commented)

### Summary

| Severity               | Count |
| ---------------------- | ----- |
| :red_circle: HIGH      | 1     |
| :yellow_circle: MEDIUM | 2     |
| :green_circle: LOW     | 1     |

### Issues to Address

| Severity               | File                     | Issue                                    |
| ---------------------- | ------------------------ | ---------------------------------------- |
| :red_circle: HIGH      | `src/auth/login.ts:45`   | SQL injection in user lookup             |
| :yellow_circle: MEDIUM | `src/utils/cache.ts:112` | Missing error handling for Redis failure |
| :yellow_circle: MEDIUM | `src/api/handler.ts:89`  | Confusing control flow - hard to debug   |

<details>
<summary>:green_circle: Low Priority Issues (1 items)</summary>

- **Inconsistent naming convention** - `src/utils/helpers.ts:23`

</details>

See inline comments for details.

_Generated by multi-agent consensus review_

File Structure

scripts/
  orchestrate_review.py  - Main orchestrator, spawns sub-agents
  aggregate_results.py   - Consensus voting logic
  post_comment.py        - Posts findings to GitHub PR
references/
  review_prompt.md       - Sub-agent review prompt template
  issue_schema.md        - JSON schema for issue output

Configuration

Environment variables:

ANTHROPIC_API_KEY - Required for sub-agent API calls
GITHUB_TOKEN - Required for PR access and commenting

Optional tuning in orchestrate_review.py:

NUM_AGENTS - Number of sub-agents (default: 3)
CONSENSUS_THRESHOLD - Min agents to agree (default: 2)
MIN_SEVERITY - Minimum severity to report (default: MEDIUM)
THINKING_BUDGET_TOKENS - Extended thinking budget (default: 128000)
MAX_TOKENS - Maximum output tokens (default: 128000)

Extended Thinking

This skill uses extended thinking (interleaved thinking) with max effort by default. Each sub-agent leverages Claude's extended thinking capability for deeper code analysis:

Budget: 128,000 thinking tokens per agent for thorough reasoning
Max output: 128,000 tokens for comprehensive issue reports

To disable extended thinking (faster but less thorough):

bash

python3 scripts/orchestrate_review.py \
  --pr-number <PR_NUMBER> \
  --repo <OWNER/REPO> \
  --diff-file pr_diff.patch \
  --no-thinking

To customize thinking budget:

bash

python3 scripts/orchestrate_review.py \
  --pr-number <PR_NUMBER> \
  --repo <OWNER/REPO> \
  --diff-file pr_diff.patch \
  --thinking-budget 50000

Search AI Tools

dyad:multi-pr-review

Install this agent skill to your Project

SKILL.md

Multi-Agent PR Review

Overview

Workflow

Step 1: Fetch PR Diff

Step 2: Run Multi-Agent Review

Step 3: Review Prompt Templates

Step 4: Consensus Aggregation & Deduplication

Step 5: Post PR Comments

Example Summary Comment

File Structure

Configuration

Extended Thinking