Agent skill
advanced-code-review
Use when performing thorough code review with historical context tracking. Triggers: 'thorough review', 'deep review', 'review this branch in detail', 'full code review with report'. More heavyweight than code-review; for quick review, use code-review instead.
Install this agent skill to your Project
npx add-skill https://github.com/axiomantic/spellbook/tree/main/skills/advanced-code-review
SKILL.md
Advanced Code Review
Announce: "Using advanced-code-review skill for multi-phase review with verification."
This is very important to my career. </ROLE>
Invariant Principles
- Verification Before Assertion: Never claim "line X contains Y" without reading line X. Every finding must be verifiable.
- Respect Previous Decisions: Declined items stay declined. Partial agreements note pending work. Alternatives, if accepted, are not re-raised.
- Severity Accuracy: Critical means data loss/security breach. High means broken functionality. Medium is quality concern. Low is polish. Nit is style.
- Evidence Over Opinion: "This could be slow" is not a finding. "O(n^2) loop at line 45 with n=10000 in hot path" is.
- Signal Maximization: Every finding in the report should be worth the reviewer's time to read.
Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| target | Yes | - | Branch name, PR number (#123), or PR URL |
| --base | No | main/master | Custom base ref for comparison |
| --scope | No | all | Limit to specific paths (glob pattern) |
| --offline | No | auto | Force offline mode (no network operations) |
| --continue | No | false | Resume previous review session |
| --json | No | false | Output JSON only (for scripting) |
Outputs
| Output | Location | Description |
|---|---|---|
| review-manifest.json | reviews// | Review metadata and configuration |
| review-plan.md | reviews// | Phase 1 strategy document |
| context-analysis.md | reviews// | Phase 2 historical context |
| previous-items.json | reviews// | Declined/partial/alternative tracking |
| findings.md | reviews// | Phase 3 findings (human-readable) |
| findings.json | reviews// | Phase 3 findings (machine-readable) |
| verification-audit.md | reviews// | Phase 4 verification log |
| review-report.md | reviews// | Phase 5 final report |
| review-summary.json | reviews// | Machine-readable summary |
Output Location: ~/.local/spellbook/docs/<project-encoded>/reviews/<branch>-<merge-base-sha>/
Mode Router
| Target Pattern | Mode | Network Required | Source of Truth |
|---|---|---|---|
| feature/xyz (branch name) | Local | No | Local files |
| #123 (PR number) | PR | Yes | Diff only |
| https://github.com/... (URL) | PR | Yes | Diff only |
| Any + --offline flag | Local | No | Local files |
Implicit Offline Detection: If target is a local branch AND no --pr flag is present, operate in offline mode automatically.
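The routing table above can be sketched as a small function. This is a minimal illustration, not the skill's actual implementation: the function name `route_mode` and the returned dictionary keys are hypothetical, and the patterns used to recognize a PR number or URL are assumptions based only on the examples in the table.

```python
import re


def route_mode(target: str, offline: bool = False) -> dict:
    """Map a review target to a mode, following the Mode Router table.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    # Assumption: a PR number looks like "#123" and a PR URL starts with
    # the GitHub host shown in the table.
    is_pr_number = re.fullmatch(r"#\d+", target) is not None
    is_pr_url = target.startswith("https://github.com/")

    # The --offline flag forces Local mode for any target; a plain branch
    # name is Local by default.
    if offline or not (is_pr_number or is_pr_url):
        return {
            "mode": "Local",
            "network_required": False,
            "source_of_truth": "Local files",
        }
    return {
        "mode": "PR",
        "network_required": True,
        "source_of_truth": "Diff only",
    }
```

Calling `route_mode("#123", offline=True)` returns the Local row, mirroring the "Any + --offline flag" entry.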
<CRITICAL>
When target is a PR number or URL, the fetched diff is the ONLY authoritative representation of the changed code. The local working tree reflects a DIFFERENT git state: it is on whatever branch was checked out when the review started, which is almost certainly not the PR branch.
Reading local files in PR mode produces silently wrong results:
- Changes introduced by the PR appear absent (local has the old code)
- Real bugs get declared "not present" → false REFUTED verdicts
- The review poisons findings with high confidence in wrong conclusions
Local files may only be read in PR mode for ONE purpose: loading project conventions (CLAUDE.md, linting config, sibling files for style context). Even then, only read files NOT in the PR's changed file set.
Before any local file read in PR mode: confirm git rev-parse HEAD matches the PR's headRefOid. If they differ, treat the local file as unavailable for that finding.
</CRITICAL>
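The guard described above could look roughly like the following. This is a sketch under stated assumptions: `current_head` and `may_read_local_file` are hypothetical helper names, and `pr_head_oid` stands for the PR's `headRefOid` value obtained elsewhere (for example from the fetched PR metadata).

```python
import subprocess
from typing import Iterable


def current_head(repo_dir: str = ".") -> str:
    """Return the commit SHA of the local working tree (git rev-parse HEAD)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def may_read_local_file(
    path: str,
    changed_files: Iterable[str],
    local_head: str,
    pr_head_oid: str,
) -> bool:
    """Decide whether a local file may be read while in PR mode.

    Hypothetical helper. Files changed by the PR are never read locally;
    other files are readable only when local HEAD matches the PR head.
    """
    if path in set(changed_files):
        return False
    return local_head == pr_head_oid
```

In use, `local_head` would come from `current_head()`; a `False` result means the local file is treated as unavailable for that finding.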
Phase Overview
| Phase | Name | Purpose | Command |
|---|---|---|---|
| 1 | Strategic Planning | Scope analysis, risk categorization, priority ordering | /advanced-code-review-plan |
| 2 | Context Analysis | Load previous reviews, PR history, declined items | /advanced-code-review-context |
| 3 | Deep Review | Multi-pass code analysis, finding generation | /advanced-code-review-review |
| 4 | Verification | Fact-check findings, remove false positives | /advanced-code-review-verify |
| 5 | Report Generation | Produce final deliverables | /advanced-code-review-report |
Phase 1: Strategic Planning
Execute: /advanced-code-review-plan
Outputs: review-manifest.json, review-plan.md
Self-Check: Target resolved, files categorized, complexity estimated, artifacts written.
Memory-Informed Planning: After resolving the review target, proactively load relevant memory:
- memory_recall(query="review finding [branch_or_module]") for prior findings on this area
- memory_recall(query="false positive [project]") for known false positive patterns
Use recalled context to prioritize review passes and set expectations for finding density.
Phase 2: Context Analysis
Execute: /advanced-code-review-context
Outputs: context-analysis.md, previous-items.json
Self-Check: Previous items loaded, PR context fetched (if online), re-check requests extracted.
Note: Phase 2 failures are non-blocking. Proceed with empty context if necessary.
Cross-Session Context: If previous review artifacts are stale (>30 days) or missing, call memory_recall(query="review decision [component]") to recover decisions from memory. This extends the "Respect Previous Decisions" principle across sessions, not just within a single review cycle.
Note: The <spellbook-memory> auto-injection fires when reading files under review, but project-wide patterns and prior review decisions for OTHER files won't appear unless explicitly recalled.
Phase 3: Deep Review
Multi-pass analysis: Security, Correctness, Quality, and Polish passes.
Execute: /advanced-code-review-review
Outputs: findings.json, findings.md
Self-Check: All files reviewed, all passes complete, declined items respected, required fields present.
Phase 4: Verification
Execute: /advanced-code-review-verify
Outputs: verification-audit.md, updated findings.json
Self-Check: All findings verified, REFUTED removed, INCONCLUSIVE flagged, signal-to-noise calculated.
Persist Verified Findings: After verification, store findings with their verdicts:
memory_store_memories(memories='{"memories": [{"content": "[Finding]: [description]. Verdict: [CONFIRMED/REFUTED]. Evidence: [key evidence].", "memory_type": "[fact or antipattern]", "tags": ["review", "verified", "[category]"], "citations": [{"file_path": "[file]", "line_range": "[lines]"}]}]}')
- CONFIRMED findings: memory_type = "antipattern" (warns future reviews)
- REFUTED findings: memory_type = "fact" with tag "false-positive" (prevents re-flagging)
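Because the `memories` argument is a JSON string, building it with `json.dumps` avoids quoting mistakes. The sketch below only constructs the payload; it does not call `memory_store_memories`. The function name `build_finding_memory` and its parameters are hypothetical, and rejecting verdicts other than CONFIRMED or REFUTED is an assumption, since the text above defines a memory type only for those two.

```python
import json

# Verdict-to-memory_type mapping taken from the two bullets above.
MEMORY_TYPE_BY_VERDICT = {"CONFIRMED": "antipattern", "REFUTED": "fact"}


def build_finding_memory(
    title: str,
    description: str,
    verdict: str,
    evidence: str,
    category: str,
    file_path: str,
    line_range: str,
) -> str:
    """Return the JSON string for one verified finding (hypothetical helper)."""
    if verdict not in MEMORY_TYPE_BY_VERDICT:
        # Assumption: other verdicts are not persisted by this helper.
        raise ValueError(f"no memory type defined for verdict {verdict!r}")

    tags = ["review", "verified", category]
    if verdict == "REFUTED":
        # Refuted findings carry the extra tag that prevents re-flagging.
        tags.append("false-positive")

    memory = {
        "content": f"{title}: {description}. Verdict: {verdict}. Evidence: {evidence}.",
        "memory_type": MEMORY_TYPE_BY_VERDICT[verdict],
        "tags": tags,
        "citations": [{"file_path": file_path, "line_range": line_range}],
    }
    return json.dumps({"memories": [memory]})
```

The returned string is what would be passed as `memories=...` in the call shown above.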
Phase 5: Report Generation
Execute: /advanced-code-review-report
Outputs: review-report.md, review-summary.json
Self-Check: Findings filtered and sorted, verdict determined, artifacts written.
Persist Review Summary: Store a high-level summary of the review outcome:
memory_store_memories(memories='{"memories": [{"content": "Review of [target]: [N] findings ([breakdown by severity]). Key themes: [themes]. Risk assessment: [level].", "memory_type": "fact", "tags": ["review-summary", "[target]", "[date]"], "citations": [{"file_path": "[report_path]"}]}]}')
This enables future reviews to reference historical review density and risk trends.
Constants and Configuration
Severity Order
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3, "NIT": 4, "PRAISE": 5}
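As a brief illustration of how this mapping might be used, the sketch below sorts findings so the most severe come first. The `sort_findings` name and the shape of each finding (a dict with a `severity` key) are assumptions for the example.

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3, "NIT": 4, "PRAISE": 5}


def sort_findings(findings: list) -> list:
    """Return findings ordered from most to least severe.

    sorted() is stable, so findings of equal severity keep their
    original relative order.
    """
    return sorted(findings, key=lambda finding: SEVERITY_ORDER[finding["severity"]])
```

A lower number sorts earlier, so CRITICAL findings lead the report and PRAISE items trail it.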
Configurable Thresholds
| Threshold | Default | Description |
|---|---|---|
| STALENESS_DAYS | 30 | Max age of previous review before ignored |
| LARGE_DIFF_LINES | 10000 | Lines threshold for chunked processing |
| SUBAGENT_THRESHOLD_FILES | 20 | Files threshold for parallel subagent dispatch |
| VERIFICATION_TIMEOUT_SEC | 60 | Max time for verification phase |
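A minimal sketch of how these thresholds might gate behavior follows. The helper names are hypothetical. The staleness comparison uses "older than 30 days", matching the ">30 days" wording in Phase 2; treating the other two thresholds as strict "greater than" comparisons is an assumption, since the table does not say whether the boundary value itself triggers.

```python
from datetime import datetime, timedelta

STALENESS_DAYS = 30
LARGE_DIFF_LINES = 10000
SUBAGENT_THRESHOLD_FILES = 20


def is_stale(reviewed_at: datetime, now: datetime) -> bool:
    """True when a previous review is older than STALENESS_DAYS."""
    return now - reviewed_at > timedelta(days=STALENESS_DAYS)


def plan_processing(diff_lines: int, file_count: int) -> dict:
    """Pick processing strategies from diff size (strict > is an assumption)."""
    return {
        "chunked": diff_lines > LARGE_DIFF_LINES,
        "parallel_subagents": file_count > SUBAGENT_THRESHOLD_FILES,
    }
```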
Offline Mode
| Feature | Online Mode | Offline Mode |
|---|---|---|
| PR metadata | Fetched | Skipped |
| PR comments | Fetched | Skipped |
| Re-check detection | Available | Not available |
Circuit Breakers
Stop execution when:
- Phase 1 fails to resolve target
- No changes found between target and base
- More than 3 consecutive verification failures
- Verification phase exceeds timeout
Recovery: Network unavailable falls back to offline. Corrupt previous review starts fresh. Unreadable files skipped with warning.
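The "more than 3 consecutive verification failures" breaker can be sketched as a streak counter. This is an illustrative helper only: the name `should_trip` and the representation of verification outcomes as booleans (`True` for success) are assumptions.

```python
from typing import Iterable


def should_trip(outcomes: Iterable[bool], max_consecutive_failures: int = 3) -> bool:
    """Return True once failures in a row exceed the allowed maximum.

    Each outcome is True for a successful verification, False for a failure.
    A success resets the streak.
    """
    streak = 0
    for succeeded in outcomes:
        streak = 0 if succeeded else streak + 1
        if streak > max_consecutive_failures:
            return True
    return False
```

With the default limit, three failures in a row do not stop execution, while a fourth consecutive failure does.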
Final Self-Check
Before declaring review complete:
Phase Completion
- Phase 1: Target resolved, manifest written
- Phase 2: Context loaded, previous items parsed
- Phase 3: All passes complete, findings generated
- Phase 4: All findings verified, REFUTED removed
- Phase 5: Report rendered, artifacts written
Quality Gates
- Every finding has: id, severity, category, file, line, evidence
- No REFUTED findings in final report
- INCONCLUSIVE findings flagged with [NEEDS VERIFICATION]
- Declined items from previous review not re-raised
- Signal-to-noise ratio calculated and reported
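Several of these gates are mechanical and could be checked with a short validator. The sketch below is one possible shape, not the skill's own code: the function name is hypothetical, and it assumes each finding is a dict that stores its verification result under a `verdict` key and carries the `[NEEDS VERIFICATION]` flag in a `title` field. Neither of those two field names is confirmed by the text above.

```python
REQUIRED_FIELDS = ("id", "severity", "category", "file", "line", "evidence")
NEEDS_VERIFICATION_FLAG = "[NEEDS VERIFICATION]"


def check_quality_gates(findings: list) -> list:
    """Return a list of human-readable gate violations (empty means pass)."""
    problems = []
    for finding in findings:
        finding_id = finding.get("id", "<no id>")

        # Gate: every finding has all required fields.
        missing = [field for field in REQUIRED_FIELDS if field not in finding]
        if missing:
            problems.append(f"{finding_id}: missing fields {missing}")

        # Assumption: the Phase 4 verdict lives under the "verdict" key.
        verdict = finding.get("verdict")
        if verdict == "REFUTED":
            problems.append(f"{finding_id}: REFUTED finding present in final report")

        # Assumption: the flag is expected in the "title" field.
        if verdict == "INCONCLUSIVE" and NEEDS_VERIFICATION_FLAG not in finding.get("title", ""):
            problems.append(f"{finding_id}: INCONCLUSIVE finding not flagged")
    return problems
```

The declined-items and signal-to-noise gates depend on data outside a single finding, so they are left out of this sketch.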
Output Verification
- All 9 artifact files listed under Outputs exist and are valid
Integration Points
MCP Tools
| Tool | Phase | Usage |
|---|---|---|
| pr_fetch | 1, 2 | Fetch PR metadata for remote reviews |
| pr_diff | 3 | Parse unified diff into structured format |
| pr_files | 1 | Extract file list from PR |
| pr_match_patterns | 1 | Categorize files by risk patterns |
Git Commands
| Command | Phase | Usage |
|---|---|---|
| git merge-base | 1 | Find common ancestor with base |
| git diff --name-only | 1 | List changed files |
| git diff | 3 | Get full diff content |
| git show | 4 | Verify file contents at SHA |
Fallback Chain
MCP pr_fetch -> gh pr view -> git diff (local only)
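One way to express a fallback chain like this is to try each source in order and return the first that succeeds. The sketch below is generic and hypothetical: `first_available` is an illustrative name, and the callables passed in stand in for whatever actually invokes the MCP tool, `gh pr view`, or `git diff`.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_available(
    sources: Sequence[Tuple[str, Callable[[], Optional[str]]]],
) -> Tuple[str, str]:
    """Try each (name, fetch) pair in order; return the first success.

    A source counts as failed if it raises or returns None.
    """
    errors = []
    for name, fetch in sources:
        try:
            result = fetch()
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
            continue
        if result is not None:
            return name, result
        errors.append(f"{name}: no result")
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

Returning the source name alongside the result lets the caller record which link of the chain was used, which matters here because the final `git diff` fallback is local only.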
<FINAL_EMPHASIS> A code review is only as valuable as its accuracy. Verify before asserting. Respect previous decisions. Prioritize by impact. Your reputation depends on being thorough AND correct. </FINAL_EMPHASIS>