code-maturity-assessor

Systematic code maturity assessment using Trail of Bits' 9-category framework. Analyzes a codebase for arithmetic safety, auditing practices, access controls, complexity, decentralization, documentation, MEV risks, low-level code, and testing. Produces a professional scorecard with evidence-based ratings and actionable recommendations.

Install this agent skill to your Project

npx add-skill https://github.com/trailofbits/skills/tree/main/plugins/building-secure-contracts/skills/code-maturity-assessor

SKILL.md

Code Maturity Assessor

Purpose

Systematically assesses codebase maturity using Trail of Bits' 9-category framework. Provides evidence-based ratings and actionable recommendations.

Framework: Building Secure Contracts - Code Maturity Evaluation v0.1.0

How This Works

Phase 1: Discovery

Explores the codebase to understand:

Project structure and platform
Contract/module files
Test coverage
Documentation availability

Phase 2: Analysis

For each of the 9 categories, I'll:

Search the code for relevant patterns
Read key files to assess implementation
Present findings with file references
Ask clarifying questions about processes I can't see in code
Determine rating based on criteria
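The pattern search and file references in Phase 2 can be pictured with a minimal Python sketch. Everything here is an illustration: the function name `find_evidence`, the `LOW_LEVEL_PATTERNS` table, and the choice of `.sol` files are assumptions, not part of the skill or of ASSESSMENT_CRITERIA.md.

```python
import re
from pathlib import Path

# Hypothetical patterns for one category (low-level code in Solidity sources).
# The real criteria live in ASSESSMENT_CRITERIA.md; these are illustrative only.
LOW_LEVEL_PATTERNS = {
    "inline assembly": re.compile(r"\bassembly\b"),
    "delegatecall": re.compile(r"\.delegatecall\s*\("),
    "low-level call": re.compile(r"\.call\s*[({]"),
}


def find_evidence(root, patterns, suffixes=(".sol",)):
    """Return (label, "path:line", source line) tuples for every pattern match."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in patterns.items():
                if pattern.search(line):
                    # Record a file:line reference so the finding can be cited.
                    hits.append((label, f"{path}:{lineno}", line.strip()))
    return hits
```

A regex scan like this only surfaces candidates. Reading the matched files and asking about off-code processes still decides the rating.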

Phase 3: Report

Generates:

Executive summary
Maturity scorecard (ratings for all 9 categories)
Detailed analysis with evidence
Priority-ordered improvement roadmap

Rating System

Missing (0): Not present/not implemented
Weak (1): Several significant improvements needed
Moderate (2): Adequate, can be improved
Satisfactory (3): Above average, minor improvements
Strong (4): Exceptional, only small improvements possible

Rating Logic:

ANY "Weak" criteria → Weak
NO "Weak" + SOME "Moderate" unmet → Moderate
ALL "Moderate" + SOME "Satisfactory" met → Satisfactory
ALL "Satisfactory" + exceptional practices → Strong
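One way to read the rating logic above is as a small decision function. The sketch below is an interpretation, not the framework's definition: the function name and parameters are hypothetical, and two cases the rules leave open are filled with labeled assumptions (a category that is not present maps to Missing, and a category with all "Moderate" criteria met but no "Satisfactory" criteria met stays at Moderate).

```python
from enum import IntEnum


class Rating(IntEnum):
    MISSING = 0
    WEAK = 1
    MODERATE = 2
    SATISFACTORY = 3
    STRONG = 4


def rate_category(present, weak_hits, moderate_met, satisfactory_met, exceptional=False):
    """Map criteria results for one category to a Rating.

    present:           whether the category is implemented at all
    weak_hits:         number of "Weak" criteria that apply
    moderate_met:      one bool per "Moderate" criterion
    satisfactory_met:  one bool per "Satisfactory" criterion
    exceptional:       whether exceptional practices were observed
    """
    if not present:
        return Rating.MISSING  # assumption: "not present" maps to Missing (0)
    if weak_hits > 0:
        return Rating.WEAK  # ANY "Weak" criteria -> Weak
    if not all(moderate_met):
        return Rating.MODERATE  # no Weak, some Moderate unmet -> Moderate
    if not any(satisfactory_met):
        # Assumption: the rules do not cover this case; kept at Moderate here.
        return Rating.MODERATE
    if all(satisfactory_met) and exceptional:
        return Rating.STRONG  # ALL Satisfactory + exceptional practices -> Strong
    return Rating.SATISFACTORY  # ALL Moderate + some Satisfactory met
```

For example, `rate_category(True, 0, [True, True], [True, False])` yields `Rating.SATISFACTORY` under these assumptions.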

The 9 Categories

I assess 9 comprehensive categories covering all aspects of code maturity. For detailed criteria, analysis approaches, and rating thresholds, see ASSESSMENT_CRITERIA.md.

Quick Reference:

1. ARITHMETIC

Overflow protection mechanisms
Precision handling and rounding
Formula specifications
Edge case testing

2. AUDITING

Event definitions and coverage
Monitoring infrastructure
Incident response planning

3. AUTHENTICATION / ACCESS CONTROLS

Privilege management
Role separation
Access control testing
Key compromise scenarios

4. COMPLEXITY MANAGEMENT

Function scope and clarity
Cyclomatic complexity
Inheritance hierarchies
Code duplication

5. DECENTRALIZATION

Centralization risks
Upgrade control mechanisms
User opt-out paths
Timelock/multisig patterns

6. DOCUMENTATION

Specifications and architecture
Inline code documentation
User stories
Domain glossaries

7. TRANSACTION ORDERING RISKS

MEV vulnerabilities
Front-running protections
Slippage controls
Oracle security

8. LOW-LEVEL MANIPULATION

Assembly usage
Unsafe code sections
Low-level calls
Justification and testing

9. TESTING & VERIFICATION

Test coverage
Fuzzing and formal verification
CI/CD integration
Test quality

For complete assessment criteria including what I'll analyze, what I'll ask you, and detailed rating thresholds (WEAK/MODERATE/SATISFACTORY/STRONG), see ASSESSMENT_CRITERIA.md.

Example Output

When the assessment is complete, you'll receive a comprehensive maturity report including:

Executive Summary: Overall score, top 3 strengths, top 3 gaps, priority recommendations
Maturity Scorecard: Table with all 9 categories rated with scores and notes
Detailed Analysis: Category-by-category breakdown with evidence (file:line references)
Improvement Roadmap: Priority-ordered recommendations (CRITICAL/HIGH/MEDIUM) with effort estimates

For a complete example assessment report, see EXAMPLE_REPORT.md.
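As a rough illustration of the priority-ordered roadmap, the sketch below sorts recommendations by the CRITICAL/HIGH/MEDIUM tiers and then by effort. The `Recommendation` fields, the `effort_days` unit, and the tie-break on effort are assumptions made for the example; the actual layout is defined in REPORT_FORMAT.md.

```python
from dataclasses import dataclass

# Tier order taken from the report description; lower number sorts first.
PRIORITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2}


@dataclass
class Recommendation:
    priority: str     # "CRITICAL", "HIGH", or "MEDIUM"
    category: str     # one of the 9 categories
    action: str       # specific improvement to make
    effort_days: int  # hypothetical effort unit, for illustration only


def order_roadmap(items):
    """Sort by priority tier first, then by smaller effort within a tier."""
    return sorted(items, key=lambda r: (PRIORITY_ORDER[r.priority], r.effort_days))
```

Sorting cheaper items first within a tier is a design choice in this sketch, not something the skill prescribes.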

Assessment Process

When invoked, I will:

1. Explore codebase
   - Find contract/module files
   - Identify test files
   - Locate documentation
2. Analyze each category
   - Search for relevant code patterns
   - Read key implementations
   - Assess against criteria
   - Collect evidence
3. Interactive assessment
   - Present my findings with file references
   - Ask about processes I can't see in code
   - Discuss borderline cases
   - Determine ratings together
4. Generate report
   - Executive summary
   - Maturity scorecard table
   - Detailed category analysis with evidence
   - Priority-ordered improvement roadmap

Rationalizations (Do Not Skip)

Rationalization	Why It's Wrong	Required Action
"Found some findings, assessment complete"	Assessment requires evaluating ALL 9 categories	Complete assessment of all 9 categories with evidence for each
"I see events, auditing category looks good"	Events alone don't equal auditing maturity	Check logging comprehensiveness, testing, incident response processes
"Code looks simple, complexity is low"	Visual simplicity masks composition complexity	Analyze cyclomatic complexity, dependency depth, state machine transitions
"Not a DeFi protocol, MEV category doesn't apply"	MEV extends beyond DeFi (governance, NFTs, games)	Verify with transaction ordering analysis before declaring N/A
"No assembly found, low-level category is N/A"	Low-level risks include external calls, delegatecall, inline assembly	Search for all low-level patterns before skipping category
"This is taking too long"	Thorough assessment requires time per category	Complete all 9 categories, ask clarifying questions about off-chain processes
"I can rate this without evidence"	Ratings without file:line references = unsubstantiated claims	Collect concrete code evidence for every category assessment
"User will know what to improve"	Vague guidance = no action	Provide priority-ordered roadmap with specific improvements and effort estimates
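Two of the required actions in the table (cover all 9 categories, and back every rating with file:line references) lend themselves to a mechanical check before the report is written. The sketch below assumes a hypothetical in-memory shape for the assessment, a dict from category name to `{"rating": ..., "evidence": [...]}`; neither that shape nor the `find_gaps` helper comes from the skill itself.

```python
import re

# Category names as listed in the quick reference.
CATEGORIES = [
    "ARITHMETIC",
    "AUDITING",
    "AUTHENTICATION / ACCESS CONTROLS",
    "COMPLEXITY MANAGEMENT",
    "DECENTRALIZATION",
    "DOCUMENTATION",
    "TRANSACTION ORDERING RISKS",
    "LOW-LEVEL MANIPULATION",
    "TESTING & VERIFICATION",
]

# A reference such as "src/Vault.sol:42" (path, colon, line number).
FILE_LINE = re.compile(r"^\S+:\d+$")


def find_gaps(assessment):
    """Return a list of problems: unassessed categories or ratings without evidence."""
    problems = []
    for name in CATEGORIES:
        entry = assessment.get(name)
        if entry is None:
            problems.append(f"{name}: not assessed")
            continue
        refs = [e for e in entry.get("evidence", []) if FILE_LINE.match(e)]
        if not refs:
            problems.append(f"{name}: no file:line evidence")
    return problems
```

This is a simplification: a category rated Missing (0) may legitimately have nothing to point at, so a real check would likely treat that case separately.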

Report Format

For detailed report structure and templates, see REPORT_FORMAT.md.

Structure:

1. Executive Summary
   - Project name and platform
   - Overall maturity (average rating)
   - Top 3 strengths
   - Top 3 critical gaps
   - Priority recommendations
2. Maturity Scorecard
   - Table with all 9 categories
   - Ratings and scores
   - Key findings notes
3. Detailed Analysis
   - Per-category breakdown
   - Evidence with file:line references
   - Gaps and improvement actions
4. Improvement Roadmap
   - CRITICAL (immediate)
   - HIGH (1-2 months)
   - MEDIUM (2-4 months)
   - Effort estimates and impact
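The "overall maturity (average rating)" line in the executive summary can be computed as a plain mean of the nine category scores. The sketch below does that and, as an added assumption, maps the mean to the nearest rating label; the source only says "average rating", so the label mapping and the sample scores are illustrative.

```python
from statistics import mean

LABELS = {0: "Missing", 1: "Weak", 2: "Moderate", 3: "Satisfactory", 4: "Strong"}


def overall_maturity(scores):
    """scores maps category name -> integer rating 0..4.

    Returns (average rounded to 2 decimals, nearest rating label).
    The nearest-label step is an assumption, not part of the framework.
    """
    avg = mean(scores.values())
    nearest = min(4, int(avg + 0.5))  # round half up, capped at Strong
    return round(avg, 2), LABELS[nearest]


# Made-up scores for illustration only.
example_scores = {
    "ARITHMETIC": 3,
    "AUDITING": 2,
    "AUTHENTICATION / ACCESS CONTROLS": 3,
    "COMPLEXITY MANAGEMENT": 2,
    "DECENTRALIZATION": 1,
    "DOCUMENTATION": 2,
    "TRANSACTION ORDERING RISKS": 2,
    "LOW-LEVEL MANIPULATION": 3,
    "TESTING & VERIFICATION": 4,
}
print(overall_maturity(example_scores))  # (2.44, 'Moderate')
```

An average can hide a single Weak category, which is why the scorecard still lists every category on its own row.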

Ready to Begin

Estimated Time: 30-40 minutes

I'll need:

Access to full codebase
Your knowledge of processes (monitoring, incident response, team practices)
Context about the project (DeFi, NFT, infrastructure, etc.)

Let's assess this codebase!

Maintainer

trailofbits Core maintainer

Source details

Full Name: trailofbits/skills
Branch: main
Path in repo: plugins/building-secure-contracts/skills/code-maturity-assessor
License: Creative Commons Attribution Share Alike 4.0 International
Topics: agent-skills
