codebase-analyzer

Statistical rule discovery from Go codebase patterns.

Install this agent skill to your Project

npx add-skill https://github.com/notque/claude-code-toolkit/tree/main/skills/codebase-analyzer

SKILL.md

Codebase Analyzer Skill

Statistical rule discovery through measurement of Go codebases. Python scripts count patterns to avoid LLM training bias, then statistics are interpreted to derive confidence-scored rules. The core principle is Measure First, Interpret Second -- what IS in the code is the local standard, not what an LLM thinks "should be" there.

Instructions

Phase 1: CONFIGURE

Goal: Validate target and select analyzer variant.

Read and follow the repository's CLAUDE.md before doing anything else -- project instructions override default behaviors.

Step 1: Validate the target

  • Confirm path points to a Go repository root with .go files
  • Check for standard structure (cmd/, internal/, pkg/)
  • Verify sufficient file count: 50+ files for meaningful rules, 100+ ideal. Below 50 files, statistics produce high variance -- patterns that look consistent may be coincidence. For small repos, combine analysis across multiple team repos rather than treating thin data as definitive.
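The file-count check above can be sketched as follows. This is a minimal illustration, not the skill's own script: the function names and the excluded-directory set are assumptions, and only the 50 and 100 thresholds come from the text.

```python
import os

# Assumption: directories that should not count toward the sample size.
EXCLUDED_DIRS = {"vendor", "testdata", ".git"}


def count_go_files(root):
    """Count .go files under root, skipping excluded directories."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        total += sum(1 for name in filenames if name.endswith(".go"))
    return total


def sample_size_verdict(n):
    """Map a file count to the thresholds named in Step 1."""
    if n >= 100:
        return "ideal"
    if n >= 50:
        return "sufficient"
    return "too small"
```

A verdict of "too small" is the signal to combine analysis across several team repos rather than derive rules from one thin sample.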

Step 2: Select cartographer variant

| Variant | Script | Metrics | Use When |
|---------|--------|---------|----------|
| Omni (recommended) | cartographer_omni.py (not yet implemented) | 100 across 25 categories | Full codebase profiling |
| Basic | cartographer.py (not yet implemented) | ~15 categories | Quick pattern overview |
| Ultimate | cartographer_ultimate.py | 6 focused categories | Performance pattern detection |

Step 3: Verify environment

  • Python 3.7+ available
  • No external dependencies needed (uses only Python standard library)
  • Output directories exist or can be created
===============================================================
 PHASE 1: CONFIGURE
===============================================================

 Target Repository:
   - Path: [/path/to/repo]
   - Go Files: [N files found]
   - Structure: [cmd/ | internal/ | pkg/ | flat]

 Variant Selected: [Omni | Basic | Ultimate]
 Reason: [why this variant]

 Validation:
   - [ ] Path exists and contains .go files
   - [ ] File count >= 50 (actual: N)
   - [ ] Python 3.7+ available
   - [ ] Output directory writable

 CONFIGURE complete. Proceeding to MEASURE...
===============================================================

Gate: Target directory exists, contains 50+ Go files, variant selected. Proceed only when gate passes.

Phase 2: MEASURE

Goal: Run statistical analysis scripts. Pure measurement -- no interpretation yet.

This phase is strictly mechanical. Scripts count and measure; keep interpretation separate from data collection. Combining measurement with interpretation introduces LLM training bias -- the model reports what "should be" instead of what IS. Run scripts first, interpret the numbers second, always as separate steps.

Automatically filter vendor/, testdata/, and generated code (files with "Code generated by..." markers) to avoid polluting statistics with external patterns.
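One way to detect generated files is to look for the standard Go marker comment, which by convention matches `^// Code generated .* DO NOT EDIT\.$` and appears before the package clause. The sketch below assumes that convention; `is_generated` is a hypothetical helper name, not part of the cartographer scripts.

```python
import re

# Go convention for generated files; the marker must precede the package clause.
GENERATED_RE = re.compile(r"^// Code generated .* DO NOT EDIT\.$")


def is_generated(source):
    """Return True if a generated-code marker appears before the package clause."""
    for line in source.splitlines():
        if GENERATED_RE.match(line):
            return True
        if line.startswith("package "):
            # Markers after the package clause do not count.
            break
    return False
```

Files for which this returns True would be dropped before any pattern is counted, alongside anything under vendor/ and testdata/.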

Step 1: Execute the cartographer

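```bash
# TODO: scripts/cartographer_omni.py not yet implemented
# Manual alternative: use grep/find to count patterns across Go files
# (vendor/ and testdata/ are excluded so external code does not skew the counts)
# Example: count error wrapping patterns
grep -rn 'fmt\.Errorf.*%w' ~/repos/my-project --include="*.go" --exclude-dir=vendor --exclude-dir=testdata | wc -l
# Example: count constructor patterns
grep -rn 'func New' ~/repos/my-project --include="*.go" --exclude-dir=vendor --exclude-dir=testdata | wc -l
```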

Always take measurements mechanically: use the cartographer scripts where they exist, or manual counts such as the grep commands above until they are implemented. Reserve LLM interpretation for Phase 3. When an LLM sees return err, it may report "not wrapping errors properly" even if that IS the local standard. Scripted counts are deterministic and reproducible; the LLM's role begins only at interpretation.
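As a rough Python counterpart to the grep examples, the sketch below counts wrapped versus bare error returns in a single Go source string. The regular expressions are line-based approximations and will miss multi-line calls; the function name and the two categories are assumptions made for illustration.

```python
import re

# Approximation: an fmt.Errorf call whose format string contains %w on the same line.
WRAPPED = re.compile(r"fmt\.Errorf\(.*%w")
# Approximation: a return statement whose last value is the bare identifier err.
BARE = re.compile(r"^\s*return\s+(?:.*,\s*)?err\s*$")


def count_error_patterns(source):
    """Count wrapped and bare error returns, one line at a time."""
    counts = {"wrapped": 0, "bare": 0}
    for line in source.splitlines():
        if WRAPPED.search(line):
            counts["wrapped"] += 1
        elif BARE.match(line):
            counts["bare"] += 1
    return counts
```

The function only reports counts. Whether a given ratio of wrapped to bare returns is a local standard is decided later, in Phase 3.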

Step 2: Verify output integrity

  • Confirm JSON output is valid and complete
  • Check file count matches expectations (no vendor pollution)
  • Verify all three lenses produced data
  • Confirm derived_rules section exists in output
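The integrity checks above could be automated along these lines. The report schema is not documented here, so the three lens key names are hypothetical; only the derived_rules section name comes from the text.

```python
import json

# Hypothetical top-level keys for the three lenses; adjust to the real report schema.
LENS_KEYS = ("consistency", "signature", "idiom")


def check_report(raw):
    """Return a list of problems found in a cartographer JSON report string."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["invalid JSON: {}".format(exc.msg)]
    if not isinstance(report, dict):
        return ["top-level value is not an object"]
    problems = []
    for key in LENS_KEYS:
        # A lens that is absent or empty produced no data.
        if not report.get(key):
            problems.append("lens '{}' missing or empty".format(key))
    if "derived_rules" not in report:
        problems.append("derived_rules section missing")
    return problems
```

An empty list means the report passes these checks; any entries should block the move to Phase 3.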

Step 3: Check for data quality issues

  • File count suspiciously high? Vendor code may be included
  • File count suspiciously low? Subdirectories may be missed
  • All percentages near 50%? May indicate mixed codebase or insufficient data
===============================================================
 PHASE 2: MEASURE
===============================================================

 Script Executed: [cartographer_omni.py (not yet implemented — use manual pattern counting)]
 Target: [/path/to/repo]

 Results:
   - Files analyzed: [N]
   - Total lines: [N]
   - Categories measured: [N of 25]
   - Derived rules: [N auto-extracted]

 Data Quality:
   - [ ] JSON output valid
   - [ ] File count reasonable (no vendor pollution)
   - [ ] All three lenses have data
   - [ ] No unexpected zeros in major categories

 Output saved to: [path/to/output.json]

 MEASURE complete. Proceeding to INTERPRET...
===============================================================

Gate: Script completed without errors, JSON output is valid, file count is reasonable. Proceed only when gate passes.

Phase 3: INTERPRET

Goal: Derive rules from statistics. This is where LLM interpretation happens -- AFTER measurement is complete.

Show the complete statistics rather than summarizing them, and report facts without editorializing about code quality -- the numbers speak for themselves.

Step 1: Review the three lenses

| Lens | Question | Measures |
|------|----------|----------|
| Consistency (Frequency) | "How often do they use X?" | Imports, test frameworks, logging, modern features |
| Signature (Structure) | "How do they name/structure things?" | Constructors, receivers, parameter order, variables |
| Idiom (Implementation) | "How do they implement patterns?" | Error handling, control flow, context usage, defer |

For detailed lens explanations, see references/three-lenses.md.

Step 2: Extract rules by confidence

Only derive rules from patterns with sufficient consistency. Forcing rules from weak patterns causes false positives in reviews and may impose standards the team has not organically adopted.

| Confidence | Threshold | Action | Example |
|------------|-----------|--------|---------|
| HIGH | >85% consistency | Extract as enforceable rule | "96% use err not e" -> MUST use err |
| MEDIUM | 70-85% consistency | Extract as recommendation | "78% guard clauses" -> SHOULD prefer guards |
| Below 70% | Not extracted as rule | Report as observation only | "55% single-letter receivers" -> No rule |
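The thresholds can be expressed as a small classifier. This is a sketch of the table's logic only; the function name and the "OBSERVATION" label are assumptions, and the boundary value of exactly 85% is treated as MEDIUM because HIGH is stated as strictly greater than 85%.

```python
def classify(consistency_pct):
    """Map a consistency percentage to a confidence level."""
    if consistency_pct > 85:
        return "HIGH"  # extract as enforceable rule
    if consistency_pct >= 70:
        return "MEDIUM"  # extract as recommendation
    return "OBSERVATION"  # report only, no rule
```

Applied to the table's examples, `classify(96)` yields HIGH, `classify(78)` yields MEDIUM, and `classify(55)` yields OBSERVATION.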

Step 3: Review Style Vector (Omni only)

  • 10 composite scores (0-100): Consistency, Modernization, Safety, Idiomaticity, Documentation, Testing Maturity, Architecture, Performance, Observability, Production Readiness
  • Identify strengths (scores >75) and gaps (scores <50)
  • Note shadow constitution entries (accepted linter suppressions)
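The strength and gap cut-offs can be applied mechanically before any interpretation. The sketch below assumes the Style Vector is available as a mapping from dimension name to score; that shape and both function names are hypothetical.

```python
def assess(score):
    """Label a 0-100 composite score using the cut-offs from Step 3."""
    if score > 75:
        return "Strength"
    if score < 50:
        return "Gap"
    return "Neutral"


def summary_rows(style_vector):
    """Render one markdown table row per dimension, in insertion order."""
    return [
        "| {} | {} | {} |".format(dimension, score, assess(score))
        for dimension, score in style_vector.items()
    ]
```

The rows fit directly into a Dimension / Score / Assessment table.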

Step 4: Cross-reference lenses

  • Pattern confirmed across multiple lenses = higher confidence
  • Pattern in one lens only = standard confidence
  • Contradictions between lenses = investigate further

Gate: Rules extracted with evidence and confidence levels. Style Vector reviewed (Omni only). Proceed only when gate passes.

Phase 4: DELIVER

Goal: Produce actionable output artifacts.

Step 1: Save statistical report

cartography_data/{repo_name}_cartography.json

Step 2: Generate derived rules document

derived_rules/{repo_name}_rules.md

Format each rule as:

```markdown
## Rule: [Statement]
**Confidence**: HIGH/MEDIUM
**Evidence**: [X% consistency across N occurrences]
**Category**: [error_handling | naming | control_flow | architecture | ...]
**Lens**: [Consistency | Signature | Idiom | Multiple]
```
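A rule entry in this format could be generated from measured values rather than typed by hand. The following is a minimal sketch; `format_rule` and its parameter names are hypothetical.

```python
def format_rule(statement, confidence, pct, occurrences, category, lens):
    """Render one derived rule in the document's markdown format."""
    if confidence not in ("HIGH", "MEDIUM"):
        # Patterns below the 70% threshold are observations, not rules.
        raise ValueError("only HIGH and MEDIUM patterns become rules")
    return "\n".join([
        "## Rule: {}".format(statement),
        "**Confidence**: {}".format(confidence),
        "**Evidence**: {}% consistency across {} occurrences".format(pct, occurrences),
        "**Category**: {}".format(category),
        "**Lens**: {}".format(lens),
    ])
```

Generating the Evidence line from the measured numbers keeps the rules document traceable to the JSON report.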

Step 3: Summarize Style Vector (Omni only)

```markdown
## Style Vector Summary
| Dimension | Score | Assessment |
|-----------|-------|------------|
| Consistency | [0-100] | [Strength/Gap/Neutral] |
| Modernization | [0-100] | [Strength/Gap/Neutral] |
| ... | ... | ... |
```

Step 4: Recommend next steps

  • Compare with pr-workflow (miner) data if available (explicit vs implicit rules)
  • Suggest CLAUDE.md updates for high-confidence rules
  • Identify golangci-lint rules that could enforce discovered patterns
  • Suggest quarterly re-analysis schedule -- coding patterns evolve with team growth and new Go versions, so a one-time snapshot becomes stale within months
===============================================================
 PHASE 4: DELIVER
===============================================================

 Artifacts:
   - [ ] JSON report: [path]
   - [ ] Rules document: [path]
   - [ ] Style Vector summary: [included in rules doc]

 Results Summary:
   - HIGH confidence rules: [N]
   - MEDIUM confidence rules: [N]
   - Observations (below threshold): [N]
   - Style Vector overall: [strong/mixed/weak]

 Next Steps:
   1. [Specific recommendation]
   2. [Specific recommendation]
   3. [Specific recommendation]

 DELIVER complete. Analysis finished.
===============================================================

Gate: JSON report saved, rules document generated, next steps documented. Analysis complete.


Complementary Skills

| Skill | Extracts | Combined Value |
|-------|----------|----------------|
| pr-workflow (miner) | Explicit rules (what people argue about in reviews) | Agreement = HIGH confidence; Silence + consistency = implicit rule |
| codebase-analyzer | Implicit rules (what they actually do) | pr-workflow (miner) says X but code does Y = rule not followed |

Reconciliation Matrix

| pr-workflow (miner) | codebase-analyzer | Conclusion |
|---------------------|-------------------|------------|
| Says X | Shows X at >85% | Confirmed rule (both explicit and practiced) |
| Silent | Shows X at >85% | Implicit rule (nobody argues because everyone agrees) |
| Says X | Shows Y at >85% | Rule stated but not followed (needs enforcement or is outdated) |
| Mixed signals | Inconsistent | No standard yet (opportunity to establish one) |
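The matrix can be read as a small decision function. The sketch below is one possible encoding, with hypothetical parameter names: `stated` is the pattern the review data argues for (or None when it is silent), `observed` is the dominant pattern in the code, and `consistency_pct` is how consistently the code follows it. Treating every case at or below 85% as "No standard yet" is an assumption, since the matrix only lists the mixed-signals case.

```python
def reconcile(stated, observed, consistency_pct):
    """Combine explicit (review) and implicit (code) evidence into a conclusion."""
    if consistency_pct <= 85:
        # Assumption: without a dominant pattern in code, no standard exists yet.
        return "No standard yet"
    if stated is None:
        return "Implicit rule"
    if stated == observed:
        return "Confirmed rule"
    return "Rule stated but not followed"
```

For example, `reconcile(None, "wrap errors with %w", 90)` returns "Implicit rule": the reviews are silent, but the code is consistent.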

Examples

Example 1: Single Repository Analysis

User says: "What conventions does this repo follow?"

Actions:

  1. Validate target has 100+ Go files (CONFIGURE)
  2. Run pattern counting against the repo (MEASURE)
  3. Extract rules from statistics: error wrapping 89%, guard clauses 5.2x, New{Type} 94% (INTERPRET)
  4. Save JSON report and rules document (DELIVER)

Result: 30+ rules extracted with confidence levels, Style Vector produced

Example 2: Team-Wide Standards Discovery

User says: "Find our team's coding patterns across all services"

Actions:

  1. Validate all target repos, confirm 50+ files each (CONFIGURE)
  2. Run cartographer on each repo separately (MEASURE)
  3. Cross-reference patterns: error wrapping 87-91% across all repos = team standard (INTERPRET)
  4. Produce team-wide rules document with per-repo breakdowns (DELIVER)

Result: Team-wide standards with cross-repo evidence

Example 3: Onboarding New Developer

User says: "I just joined the team, what coding patterns should I follow?"

Actions:

  1. Identify main team repos, validate Go file counts (CONFIGURE)
  2. Run omni-cartographer on primary service (MEASURE)
  3. Extract top 10 HIGH confidence rules as onboarding checklist (INTERPRET)
  4. Produce concise rules doc focusing on error handling, naming, and control flow (DELIVER)

Result: Evidence-based onboarding guide with concrete examples from actual codebase

Error Handling

Error: "No Go files found"

Cause: Path does not point to a Go repository root, or .go files are in subdirectories not being scanned

Solution:

  1. Verify path points to repository root with ls *.go or find . -name "*.go" | head
  2. If Go files are nested, point to parent directory
  3. Confirm vendor/ is not the only directory containing Go files

Error: "No rules derived"

Cause: Codebase too small (<50 files) or patterns genuinely inconsistent

Solution:

  1. Check file count -- if <50, combine analysis across multiple repos from same team
  2. If 50+ files but no rules, the team genuinely lacks consistent patterns
  3. Lower threshold to 60% to find emerging patterns (note reduced confidence)

Error: "Statistics dominated by vendor/generated code"

Cause: Vendor directory or generated files not filtered, polluting pattern data

Solution:

  1. Verify scripts are filtering vendor/, testdata/, and _test.go files for core patterns
  2. If non-standard structure, analyze specific directories manually
  3. Check for generated code markers (Code generated by...) and exclude those files

References

Reference Files

  • ${CLAUDE_SKILL_DIR}/references/three-lenses.md: Detailed explanation of the three analysis lenses
  • ${CLAUDE_SKILL_DIR}/references/examples.md: Real-world analysis examples and workflows
  • ${CLAUDE_SKILL_DIR}/references/metrics-catalog.md: Complete 100-metric catalog across 25 categories

Prerequisites

  • Python 3.7+
  • Go codebase to analyze (50+ files recommended)
  • No external dependencies (uses only Python standard library)
