Agent skill
verification-protocol
Trigger Pattern Always (used by all verifier agents) - Inject Into security-verifier agents (Phase 5)
Install this agent skill to your Project
npx add-skill https://github.com/PlamenTSV/plamen/tree/main/agents/skills/sui/verification-protocol
SKILL.md
Verification Protocol (Sui Move)
Trigger Pattern: Always (used by all verifier agents) Inject Into: security-verifier agents (Phase 5) Purpose: Prove hypotheses TRUE or FALSE using Sui Move test framework with
test_scenarioPoC code.
Evidence Source Tracking (MANDATORY)
CRITICAL: For EVERY piece of evidence used in verification, you MUST tag its source. Evidence from mocks or unverified external packages CANNOT support a REFUTED verdict.
Evidence Source Tags
| Tag | Meaning | Valid for REFUTED? |
|---|---|---|
| [PROD-ONCHAIN] | Production Sui object data (via Sui Explorer or RPC) | YES |
| [PROD-SOURCE] | Verified source from Sui Explorer / published package | YES |
| [PROD-PUBLISHED] | Test against published package bytecode | YES |
| [CODE] | Audited codebase (in-scope source) | YES |
| [MOCK] | Mock/test modules or objects | NO |
| [EXT-UNV] | External, unverified package behavior | NO |
| [DOC] | Documentation/spec only | NO (needs verification) |
Evidence Audit Table (REQUIRED in every verification output)
Before ANY verdict, fill this table:
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
| "External package returns X" | Mock module | [MOCK] | NO |
| "Object ownership is Y" | sources/module.move:123 | [CODE] | YES |
| "Shared object state is Z" | Sui Explorer object view | [PROD-ONCHAIN] | YES |
Mock Rejection Rule
AUTOMATIC OVERRIDE: If ANY evidence supporting REFUTED has tag [MOCK] or [EXT-UNV]:
- CANNOT return REFUTED
- MUST return CONTESTED
- Triggers production verification
Example:
## Verdict: REFUTED -> CONTESTED (mock evidence override)
### Evidence Audit
| Claim | Source | Tag | Valid? |
|-------|--------|-----|--------|
| "External module validates input" | test_helper.move:45 | [MOCK] | NO |
**Override reason**: REFUTED verdict relies on mock behavior at test_helper.move:45.
Production package behavior is UNVERIFIED. Must fetch published package source.
Pre-Verification Understanding
Before writing ANY test code, you MUST answer:
Question 1: What is the EXACT bug?
NOT: "Object ownership is wrong"
NOT: "Access control is missing"
NOT: "State is inconsistent"
YES: "Function [X] in module [Y] accepts shared object [Z] as `&mut` without
verifying caller holds [CapabilityType], allowing any address to mutate
field [W] at line [N]"
Question 2: What OBSERVABLE difference proves it?
NOT: "State changed"
NOT: "Object was modified"
YES: "Before exploit: pool.total_supply = 1000, attacker_balance = 0
After exploit: pool.total_supply = 1000, attacker_balance = 500
Expected: transaction should have aborted with ENotAuthorized"
Question 3: What is the EXACT assertion?
NOT: assert!(exploit_worked, 0)
YES: assert!(coin::value(&stolen_coin) > 0, ERR_EXPLOIT_FAILED)
OR: // Transaction should abort -- if it succeeds, the bug exists
OR: assert!(state_after.field != state_before.field, ERR_STATE_UNCHANGED)
If you cannot answer all three -> ASK FOR CLARIFICATION
Pre-PoC Feasibility Gates (MANDATORY)
Before writing test code, verify these two gates. If either FAILS, adjust the hypothesis.
Gate F1: Reachability
Trace a call path from a permissionless entry point to the vulnerable code.
- Entry point identified (public/external/entry function)
- Call path traced through intermediary functions
- All access checks on the path are passable by the attacker profile
If NO entry point reaches the vulnerable code → UNREACHABLE → FALSE_POSITIVE. If reachable only through a restricted path → document the restriction, adjust likelihood.
Gate F2: Math Bounds
Substitute real-world value domains into the expression that triggers the bug.
- Parameter domains identified (token decimals, max supply, TVL range, fee range, time bounds)
- Expression evaluated at worst-case feasible inputs
- Result crosses the bug threshold
If the bug requires values outside feasible domains → INFEASIBLE → FALSE_POSITIVE. If feasible only at extreme but realistic parameters → document the threshold, proceed with adjusted severity.
Both gates PASS → proceed to PoC. Either gate FAILS → document and stop.
Test File Templates
See
templates.mdin this directory for all Sui Move test templates (Templates 1-6: shared object mutation, capability theft, dynamic fields, object wrapping, PTB exploit, concurrent access).
Interpreting Results
Test PASSES -> Bug CONFIRMED
The assertion that "proves the bug" succeeded.
Test FAILS -> Check Why
| Failure | Meaning | Action |
|---|---|---|
| Abort with error code | Function validation rejected the action | Check if rejection IS the bug or a fix |
test_scenario::take_from_sender fails |
Object not at expected address | Check transfer logic in setup |
test_scenario::take_shared fails |
Shared object not published | Check initialization creates shared objects |
| Type mismatch | Wrong object type taken from scenario | Fix type parameters |
| Arithmetic abort (overflow/underflow) | Math operation failed | Check if this IS the bug or setup error |
| Borrow checker error (compile) | Cannot borrow object mutably | Restructure test to respect Move borrow rules |
Iteration Protocol
Attempt 1: Direct implementation of test strategy from hypothesis.
Attempt 2: Adjust parameters:
- Different coin amounts (larger/smaller, edge values like 0, 1, u64::MAX)
- Different transaction ordering (swap next_tx blocks)
- Different actor addresses
- Different object states (empty pool, full pool, single-user, multi-user)
Attempt 3: Re-examine assumptions:
- Are shared objects properly published in setup?
- Are capability objects at the right addresses?
- Is the module's initialization complete (all shared objects created)?
- Are type parameters correct (generic type instantiation)?
- Does the function require a
ClockorTxContextargument not provided?
After 5 attempts: If still fails -> FALSE_POSITIVE with documented reasoning.
Severity Determination
CRITICAL
- Direct fund theft (Coin drain from shared pools)
- Unauthorized admin capability acquisition
- Arbitrary package upgrade (if upgrade cap compromised and no timelock)
- No special prerequisites needed
- Attacker profits significantly
HIGH
- Fund loss with specific setup (object pre-creation, ordering dependency)
- Broken core functionality (deposits, withdrawals, swaps, liquidations)
- Shared object state corruption affecting all users
- Significant TVL at risk
MEDIUM
- Limited fund loss under specific conditions
- Object state corruption (non-fund data)
- Edge cases with real impact at design limits
- Dynamic field pollution affecting protocol behavior
- Moderate value at risk
LOW
- Negligible direct impact
- Extreme edge cases only
- Admin-controlled risk (with multisig governance)
- View function / event emission issues
- Stranded non-value objects
Exchange Rate Finding Severity (MANDATORY)
CRITICAL: Before assigning severity to ANY finding affecting share/asset ratios or exchange rates, you MUST complete this quantitative analysis.
Required Quantitative Analysis
For findings affecting exchange rates, fill in this table:
| Metric | Value | Source |
|---|---|---|
| Protocol TVL | [X SUI or USD] | Production or documented estimate |
| Attack cost | [Y] | Calculated from attack steps (gas, tokens, opportunity) |
| Attacker profit | [Z] | Calculated (extraction - cost) |
| Victim loss per user | [W] | Calculated per affected user |
| Affected user count | [N] | one / some / all |
| Profit ratio | [Z/Y] | Attacker profit / attack cost |
Severity Calculation
Step 1: Calculate total impact = W * N (victim loss * affected users) Step 2: Calculate profitability = Z/Y (attacker profit / cost) Step 3: Apply severity matrix:
| Total Impact | Profitability > 2x | Profitability 1-2x | Profitability < 1x |
|---|---|---|---|
| > $100,000 | CRITICAL | HIGH | HIGH |
| $10,000 - $100,000 | HIGH | HIGH | MEDIUM |
| $1,000 - $10,000 | HIGH | MEDIUM | MEDIUM |
| < $1,000 | MEDIUM | LOW | LOW |
What NOT to Do
- "This enables extraction" (qualitative, no numbers)
- "Attacker can profit significantly" (undefined)
- "Loss of funds possible" (unquantified)
What TO Do
- "Attacker profits 500,000 SUI ($500,000) from 1,000 SUI ($1,000) investment"
- "Each victim loses up to 2% of deposit value, affecting all pool users"
- "Total extractable value: $500,000 with 500x profit ratio -> CRITICAL"
Design Flaw Severity Escalation
When a finding is classified as a "design flaw" rather than an exploit, apply this escalation check:
| Criterion | YES/NO |
|---|---|
| Risk-free for the attacker (no capital at risk, or attacker profits even if partial) | |
| Repeatable (can be executed on every occurrence of a triggering event) | |
| Scales with protocol usage (impact grows with TVL, user count, or time) | |
| No mitigation without code change (off-chain monitoring cannot prevent, only detect) |
If ALL 4 criteria are YES: Severity floor = MEDIUM (cannot be rated LOW or Informational) If 3 of 4 criteria are YES: Recheck -- the remaining criterion may not actually block the attack at scale
Advanced Protocol Reference: See
advanced.mdfor RAG queries, RAG confidence override, chain hypothesis protection, Sui-specific testing considerations, dual-perspective verification, realistic parameter validation, anti-downgrade guard, new observations, error trace output, and bidirectional role analysis.
Output Format
CONFIRMED
## Verdict: CONFIRMED
### Bug Mechanism Verified
{Explain what the test_scenario test proves in 2-3 sentences}
### Test Code
{Full Move test function}
### Test Output
{Relevant assertions and logged values from `sui move test`}
### Key Evidence
| Metric | Value |
|--------|-------|
| Before | {value} |
| After | {value} |
| Expected | {value} |
| Difference | {calculation} |
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
### RAG Evidence
- **Attack Vectors Consulted**: [list]
- **Similar Exploits Found**: [count]
- **Historical Precedent**: [description]
### Severity: {LEVEL}
{Justification in 1-2 sentences}
FALSE_POSITIVE
## Verdict: FALSE_POSITIVE
### Attempts Made
**Attempt 1:**
- Approach: {description}
- Result: {what happened -- include abort codes}
- Learning: {insight}
**Attempt 2:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}
**Attempt 3:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
### Why It Is Not a Bug
{Explain the actual behavior and why hypothesis was wrong in 2-3 sentences}
### Error Trace
- **Failure Type**: {type}
- **Location**: {location}
- **Error Code**: {code}
- **State at Failure**: {state}
- **Investigation Question**: {question}
CONTESTED
## Verdict: CONTESTED
### Evidence Status
| Checkpoint | Status | Details |
|------------|--------|---------|
| External package behavior verified against PRODUCTION | YES/NO | {details} |
| All entry functions checked | YES/NO | {details} |
| Object ownership model verified | YES/NO | {details} |
| Shared object access control confirmed | YES/NO | {details} |
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
### Why This Cannot Be REFUTED
{Explain what evidence is missing to definitively rule out the bug}
### Escalation Required
- [ ] Fetch published package source for {external dep}
- [ ] Dump production object state for {object}
- [ ] Check additional entry function paths: {list}
### Error Trace
- **Failure Type**: {type}
- **Location**: {location}
- **Error Code**: {code}
- **State at Failure**: {state}
- **Investigation Question**: {question}
Insufficient Evidence (HALT CONDITIONS)
Before marking REFUTED, check ALL boxes:
- External package behavior verified against PRODUCTION (not mock)
- Attack path checked on ALL public entry functions that access the same shared objects
- Profit calculated with attacker HOLDING tokens (not just transferring in)
- Missing precondition documented (type: STATE / ACCESS / TIMING / EXTERNAL / BALANCE)
- Searched other findings for matching postconditions (chain analysis integration)
- Object ownership verified in source (not assumed from naming)
- Capability access control verified for ALL shared object mutation paths
- Dynamic field access patterns verified (correct key types, no collisions)
Evidence That Does NOT Count
- "Mock module shows X" -- mocks are not production behavior
- "Standard Coin<T>" -- may be wrapped in custom module with hooks/restrictions
- "Attacker loses by sending coins" -- may profit via position held in pool
- "Function is
public(package)" -- may be callable via CPI from another module in the same package - "Requires AdminCap" -- AdminCap may have
storeability and be transferable - "Attacker cannot acquire X" -- another finding may CREATE this condition
- "Object is owned by admin" -- ownership may be transferable if object has
store
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
integration-hazard-research
Protocol Type Trigger NAMED_EXTERNAL_PROTOCOL (detected when recon finds import/interface for an identifiable external protocol — not standard libraries). Researches known integration hazards of the target protocol.
outcome-determinism
Protocol Type Trigger outcome_determinism - detected when EITHER of these code patterns are present - - Selection from finite depletable pool with fallback behavior (while(full)...
governance-attack-vectors
Protocol Type Trigger governance (detected when Governor, Timelock, voting, proposal, quorum, delegate patterns found) - Inject Into Breadth agents, depth-external, depth-edge-case
vault-accounting
Protocol Type Trigger vault (detected in recon TASK 0 Step 1) - Inject Into Core state agent OR economic design agent (merge via M4 hierarchy)
lending-protocol-security
Protocol Type Trigger lending (detected when recon finds liquidate|borrow|repay|collateral|lend|loan|LTV|healthFactor|interestRate|debtToken) - Inject Into Breadth agents, depth...
dex-integration-security
Protocol Type Trigger dex_integration (detected when recon finds swap|addLiquidity|removeLiquidity|IUniswapV2Router|ISwapRouter|amountOutMin|amountOutMinimum|slippage - AND the...
Didn't find tool you were looking for?