Install this agent skill to your Project

npx add-skill https://github.com/rysweet/amplihack/tree/main/.claude/skills/silent-degradation-audit

Silent Degradation Audit Skill

Overview

Production-ready skill for detecting silent degradation across codebases. Uses a multi-wave audit system with 6 specialized category agents, a multi-agent validation panel, and convergence detection. Battle-tested on the CyberGym codebase (~250 bugs found).

When to Use This Skill

Use this skill when:

Code has reliability issues, but it is unclear where
Systems fail silently without operator visibility
Error handling exists, but its effectiveness is unknown
You need a comprehensive audit across multiple failure modes
You are preparing for production deployment
You are running a post-mortem after silent failures

Don't use for:

Code style or formatting issues (use linters)
Performance optimization (use profilers)
Security vulnerabilities (use security scanners)
Simple one-off code reviews (use /analyze)

Key Features

Multi-Wave Progressive Audit

Wave 1: Broad scan, finds obvious issues (40-50% of total)
Wave 2-3: Deeper analysis, finds hidden issues (30-40%)
Wave 4-6: Edge cases and subtleties (10-20%)
Convergence: Stops when a wave yields < 10 new findings or < 5% of the Wave 1 count
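
The dual-threshold stopping rule above can be sketched in a few lines. This is a minimal illustration, not the skill's actual `convergence_tracker.py`: the function name `has_converged` and its signature are hypothetical, and checking the relative threshold first is an assumption chosen so the reason string resembles the one in the sample report.

```python
def has_converged(new_findings: int, wave1_findings: int,
                  absolute_threshold: int = 10,
                  relative_threshold: float = 0.05) -> tuple[bool, str]:
    """Return (converged, reason) for the wave that just finished.

    Hypothetical helper: thresholds default to the documented 10 and 5%.
    """
    # Relative check: new findings as a share of the Wave 1 count.
    if wave1_findings > 0:
        ratio = new_findings / wave1_findings
        if ratio < relative_threshold:
            return True, (f"Relative threshold met: "
                          f"{ratio:.1%} < {relative_threshold:.1%}")
    # Absolute check: fewer new findings than the fixed cutoff.
    if new_findings < absolute_threshold:
        return True, (f"Absolute threshold met: "
                      f"{new_findings} < {absolute_threshold}")
    return False, "Not converged"


print(has_converged(5, 120))
print(has_converged(18, 120))
```

Either condition alone is enough to stop, so small codebases tend to stop on the absolute threshold while large ones stop on the relative one.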

6 Category Agents

Dependency Failures (Category A): "What happens when X is down?"
Config Errors (Category B): "What happens when config is wrong?"
Background Work (Category C): "What happens when background work fails?"
Test Effectiveness (Category D): "Do tests actually detect failures?"
Operator Visibility (Category E): "Is the error visible to operators?"
Functional Stubs (Category F): "Does this code actually do what its name says?"

Multi-Agent Validation Panel

3 agents review findings: Security, Architect, Builder
2/3 consensus required to validate finding
Prevents false positives and unnecessary changes
Tracks strong vs weak consensus
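
As a rough sketch of how such a panel vote could be tallied: the function name `tally_votes` is hypothetical, and so are the `REJECTED` and `INCONCLUSIVE` result labels and the reading of "strong" as unanimous and "weak" as a bare majority. Only `VALIDATED`, `strong`, and the APPROVE/REJECT/ABSTAIN votes appear in this document.

```python
import math
from collections import Counter
from fractions import Fraction


def tally_votes(votes: dict[str, str],
                required: Fraction = Fraction(2, 3)) -> dict[str, str]:
    """Turn per-agent votes into a result plus a consensus strength."""
    counts = Counter(votes.values())
    total = len(votes)
    # Smallest number of matching votes that satisfies the required share.
    needed = math.ceil(required * total)
    for vote, result in (("APPROVE", "VALIDATED"), ("REJECT", "REJECTED")):
        if total and counts[vote] >= needed:
            # Assumption: unanimous means strong, anything less is weak.
            strength = "strong" if counts[vote] == total else "weak"
            return {"result": result, "consensus": strength}
    # No side reached the required share: leave it for human review.
    return {"result": "INCONCLUSIVE", "consensus": "none"}


print(tally_votes({"security": "APPROVE", "architect": "APPROVE",
                   "builder": "REJECT"}))
```

Using `Fraction` rather than a float keeps the 2/3 comparison exact, and the same function covers a four-agent panel by passing `required=Fraction(3, 4)`.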

Language-Agnostic

Supports 9 languages with language-specific patterns:

Python, JavaScript, TypeScript
Rust, Go, Java, C#
Ruby, PHP

Integration Modes

Standalone Invocation

Direct skill invocation for focused audit:

/silent-degradation-audit path/to/codebase

Sub-Loop in Quality Audit Workflow

Integrated as Phase 2 of quality-audit-workflow:

quality-audit-workflow calls silent-degradation-audit
→ Returns findings to quality workflow
→ Quality workflow applies fixes
→ Continues to next phase

Usage

Basic Usage

```bash
# Audit entire codebase
/silent-degradation-audit .

# Audit specific directory
/silent-degradation-audit ./src

# With custom exclusions
/silent-degradation-audit . --exclusions .my-exclusions.json
```

Configuration

Create .silent-degradation-config.json in codebase root:

```json
{
  "convergence": {
    "absolute_threshold": 10,
    "relative_threshold": 0.05
  },
  "max_waves": 6,
  "exclusions": {
    "patterns": ["*.test.js", "test_*.py", "**/__tests__/**"]
  },
  "categories": {
    "enabled": [
      "dependency-failures",
      "config-errors",
      "background-work",
      "test-effectiveness",
      "operator-visibility",
      "functional-stubs"
    ]
  }
}
```
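
One way a tool might read this file is to overlay it on built-in defaults, so a partial config only needs the keys it changes. The sketch below is an assumption about behavior, not the skill's documented loader: `load_config`, `deep_merge`, and `DEFAULTS` are hypothetical names, and falling back to defaults when the file is missing is a guess.

```python
import copy
import json
from pathlib import Path

# Assumed defaults, taken from the thresholds quoted in this document.
DEFAULTS = {
    "convergence": {"absolute_threshold": 10, "relative_threshold": 0.05},
    "max_waves": 6,
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def load_config(root: str) -> dict:
    """Load .silent-degradation-config.json from root, if present."""
    path = Path(root) / ".silent-degradation-config.json"
    if not path.exists():
        return copy.deepcopy(DEFAULTS)
    user_config = json.loads(path.read_text(encoding="utf-8"))
    return deep_merge(DEFAULTS, user_config)
```

With this approach a config containing only `{"convergence": {"relative_threshold": 0.1}}` would still inherit the absolute threshold and wave limit.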

Exclusion Lists

Global Exclusions

Edit ~/.amplihack/.claude/skills/silent-degradation-audit/exclusions-global.json:

```json
[
  {
    "pattern": "*.test.*",
    "reason": "Test files excluded from production audits",
    "category": "*"
  },
  {
    "pattern": "**/vendor/**",
    "reason": "Third-party code",
    "category": "*"
  }
]
```

Repository-Specific Exclusions

Create .silent-degradation-exclusions.json in repository root:

```json
[
  {
    "pattern": "src/legacy/*.py",
    "reason": "Legacy code being replaced",
    "category": "*",
    "wave": 1
  },
  {
    "pattern": "api/endpoints.py:42",
    "reason": "Empty dict is valid API response",
    "category": "functional-stubs",
    "type": "exact"
  }
]
```
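
To make the rule fields concrete, here is a minimal sketch of how a finding might be checked against such a list. The matching semantics are assumptions, not the behavior of the real `exclusion_manager.py`: `is_excluded` is a hypothetical name, glob rules are matched against the file path with Python's `fnmatch`, `"type": "exact"` rules are compared against `file:line`, and the `wave` field is ignored because its meaning is not described here.

```python
from fnmatch import fnmatchcase


def is_excluded(finding: dict, exclusions: list[dict]) -> bool:
    """Return True if any exclusion rule covers this finding."""
    location = f"{finding['file']}:{finding['line']}"
    for rule in exclusions:
        # A rule applies to one category, or to all of them via "*".
        if rule.get("category", "*") not in ("*", finding["category"]):
            continue
        if rule.get("type") == "exact":
            # Assumed: exact rules name a single file:line location.
            if rule["pattern"] == location:
                return True
        elif fnmatchcase(finding["file"], rule["pattern"]):
            return True
    return False


rules = [
    {"pattern": "src/legacy/*.py", "category": "*"},
    {"pattern": "api/endpoints.py:42", "category": "functional-stubs",
     "type": "exact"},
]
finding = {"file": "api/endpoints.py", "line": 42,
           "category": "functional-stubs"}
print(is_excluded(finding, rules))
```

Note that in `fnmatch` a `*` also matches `/`, so `src/legacy/*.py` covers nested files as well; a stricter glob library would behave differently.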

Output

Report Format

Generates .silent-degradation-report.md:

```markdown
# Silent Degradation Audit Report

## Summary

- **Total Waves**: 4
- **Total Findings**: 137
- **Converged**: Yes
- **Convergence Ratio**: 4.2%

## Convergence Progress

Wave 1: ██████████████████████████████████████████████████ 120
Wave 2: ███████████████████████████ 65 (54.2% of Wave 1)
Wave 3: ████████ 18 (15.0% of Wave 1)
Wave 4: ██ 5 (4.2% of Wave 1)

Status: ✓ CONVERGED
Reason: Relative threshold met: 4.2% < 5.0%

## Findings by Category

### dependency-failures (42 findings)

- High: 15
- Medium: 20
- Low: 7

[... continues for all 6 categories ...]
```

Findings Format

Generates .silent-degradation-findings.json:

```json
[
  {
    "id": "dep-001",
    "category": "dependency-failures",
    "severity": "high",
    "file": "src/payments.py",
    "line": 89,
    "description": "Payment API failure silently falls back to mock",
    "impact": "Production system using mock payments, no real charges",
    "visibility": "None - no logs or metrics",
    "recommendation": "Add explicit failure logging and metric, or fail fast",
    "wave": 1,
    "validation": {
      "result": "VALIDATED",
      "consensus": "strong",
      "votes": {
        "security": "APPROVE",
        "architect": "APPROVE",
        "builder": "APPROVE"
      }
    }
  },
  ...
]
```
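
Because the findings file is plain JSON, it is easy to post-process. The sketch below counts validated findings per category and severity. It is only an illustration: `summarize_validated` is a hypothetical helper, the second sample entry is invented for the example, and the `REJECTED` result label is an assumption.

```python
from collections import Counter


def summarize_validated(findings: list[dict]) -> dict[str, Counter]:
    """Count validated findings per category and severity."""
    summary: dict[str, Counter] = {}
    for finding in findings:
        # Skip anything the panel did not validate.
        if finding.get("validation", {}).get("result") != "VALIDATED":
            continue
        by_severity = summary.setdefault(finding["category"], Counter())
        by_severity[finding["severity"]] += 1
    return summary


sample = [
    {"id": "dep-001", "category": "dependency-failures", "severity": "high",
     "validation": {"result": "VALIDATED"}},
    # Hypothetical second entry, not taken from a real findings file.
    {"id": "cfg-001", "category": "config-errors", "severity": "low",
     "validation": {"result": "REJECTED"}},
]
print(summarize_validated(sample))
```

In practice the list would come from `json.load` on `.silent-degradation-findings.json` rather than from an inline sample.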

Workflow Details

Phase 1: Initialization

Create convergence tracker with thresholds
Initialize exclusion manager
Set up audit state

Phase 2: Language Detection

Scan codebase for file extensions
Identify languages (a language counts if it has > 5 files or exceeds the 5% share threshold)
Load language-specific patterns
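
The detection step can be sketched as a count of file extensions followed by a threshold test. This is not the real `language_detector.py`: `detect_languages` and the `EXTENSIONS` table are hypothetical, and reading "5%" as a share of recognized source files is an assumption.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical extension table covering the 9 supported languages.
EXTENSIONS = {
    ".py": "python", ".js": "javascript", ".ts": "typescript",
    ".rs": "rust", ".go": "go", ".java": "java", ".cs": "csharp",
    ".rb": "ruby", ".php": "php",
}


def detect_languages(paths: list[str], min_files: int = 5,
                     min_share: float = 0.05) -> list[str]:
    """Return languages with > min_files files or > min_share of files."""
    counts: Counter = Counter()
    for path in paths:
        language = EXTENSIONS.get(PurePosixPath(path).suffix.lower())
        if language:
            counts[language] += 1
    total = sum(counts.values())
    return sorted(
        language for language, n in counts.items()
        if n > min_files or n / total > min_share
    )


files = [f"src/m{i}.py" for i in range(100)] + ["tools/a.go", "README.md"]
print(detect_languages(files))
```

The two conditions complement each other: the file count catches real use in large repositories, while the share catches small repositories where no language reaches the count.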

Phase 3: Load Exclusions

Load global exclusions from skill directory
Load repository-specific exclusions
Merge into single exclusion list

Phase 4: Wave Loop

For each wave (until convergence):

Category Analysis (6 agents in parallel)
- Each agent scans for category-specific issues
- Uses language-specific patterns
- Excludes previous findings
Validation Panel (3 agents in parallel)
- Security agent reviews security implications
- Architect agent reviews design impact
- Builder agent reviews implementation feasibility
Vote Tallying
- Require 2/3 consensus (APPROVE)
- Track strong vs weak consensus
- Flag inconclusive for human review
Exclusion Filtering
- Apply global and repo-specific exclusions
- Filter out duplicates
State Update
- Add new findings to total
- Record wave metrics
Convergence Check
- Absolute: < 10 new findings
- Relative: < 5% of Wave 1 findings
- Break if converged
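
The loop above can be outlined as follows. This is a simplified, single-threaded sketch under stated assumptions: `run_audit` and the callables `scan_wave`, `validate`, and `is_excluded` are hypothetical stand-ins for the category agents, the validation panel, and the exclusion manager, and duplicates are identified by file, line, and category.

```python
from typing import Callable


def run_audit(scan_wave: Callable[[int, set], list[dict]],
              validate: Callable[[dict], bool],
              is_excluded: Callable[[dict], bool],
              max_waves: int = 6,
              absolute_threshold: int = 10,
              relative_threshold: float = 0.05):
    """Run waves until convergence; return (findings, wave_counts, converged)."""
    seen: set[tuple] = set()
    all_findings: list[dict] = []
    wave_counts: list[int] = []
    converged = False
    for wave in range(1, max_waves + 1):
        new = []
        # Category analysis: agents receive `seen` to skip earlier findings.
        for finding in scan_wave(wave, seen):
            key = (finding["file"], finding["line"], finding["category"])
            if not validate(finding):                # panel vote
                continue
            if is_excluded(finding) or key in seen:  # exclusions, duplicates
                continue
            seen.add(key)
            new.append({**finding, "wave": wave})
        # State update.
        all_findings.extend(new)
        wave_counts.append(len(new))
        # Convergence check against both thresholds.
        first = wave_counts[0]
        below_absolute = len(new) < absolute_threshold
        below_relative = first > 0 and len(new) / first < relative_threshold
        if below_absolute or below_relative:
            converged = True
            break
    return all_findings, wave_counts, converged
```

Passing the stages in as callables keeps the loop itself testable without real agents.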

Phase 5: Report Generation

Generate convergence plot
Calculate metrics summary
Categorize findings by type and severity
Write markdown report
Write JSON findings
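
The convergence plot in the report is a simple text bar chart, which could be produced along these lines. The function name `render_convergence` and the 50-character bar width are assumptions, not details taken from the skill's code.

```python
def render_convergence(wave_counts: list[int], width: int = 50) -> str:
    """Render one bar per wave, scaled so the largest wave fills `width`."""
    if not wave_counts:
        return ""
    first = wave_counts[0]
    peak = max(wave_counts) or 1  # avoid dividing by zero
    lines = []
    for number, count in enumerate(wave_counts, start=1):
        # Integer arithmetic that rounds count * width / peak to nearest.
        bar = "█" * ((2 * count * width + peak) // (2 * peak))
        line = f"Wave {number}: {bar} {count}"
        if number > 1 and first > 0:
            line += f" ({count / first:.1%} of Wave 1)"
        lines.append(line)
    return "\n".join(lines)


print(render_convergence([120, 65, 18, 5]))
```

Rounding with integers avoids floating-point surprises at exact half values such as 7.5.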

Architecture

Directory Structure

.claude/skills/silent-degradation-audit/
├── SKILL.md                    # This file
├── reference.md                # Detailed patterns and examples
├── examples.md                 # Usage examples
├── patterns.md                 # Language-specific patterns
├── README.md                   # Quick start
├── category_agents/            # 6 category agent definitions
│   ├── dependency-failures.md
│   ├── config-errors.md
│   ├── background-work.md
│   ├── test-effectiveness.md
│   ├── operator-visibility.md
│   └── functional-stubs.md
├── validation_panel/           # Validation panel specs
│   ├── panel-spec.md
│   └── voting-rules.md
├── recipe/                     # Recipe-based workflow
│   └── audit-workflow.yaml
└── tools/                      # Python utilities
    ├── exclusion_manager.py
    ├── language_detector.py
    ├── convergence_tracker.py
    └── __init__.py

Component Responsibilities

Category Agents:

Scan codebase for category-specific issues
Use language-specific patterns
Produce findings with severity, impact, recommendation

Validation Panel:

Review findings from multiple perspectives
Vote APPROVE/REJECT/ABSTAIN
Require 2/3 consensus

Convergence Tracker:

Track findings per wave
Calculate convergence metrics
Determine when to stop

Exclusion Manager:

Load and merge exclusion lists
Filter findings against patterns
Add new exclusions

Language Detector:

Identify languages in codebase
Load language-specific patterns
Support 9 languages

Best Practices

Running First Audit

Start with small scope: Audit single service/module first
Review Wave 1 carefully: Establishes baseline
Tune exclusions: Add false positives to exclusion list
Verify fixes: Test fixes before applying broadly

Exclusion Management

When to add exclusions:

False positives (finding not actually an issue)
Intentional design (behavior is correct as-is)
Legacy code (not worth fixing right now)
Third-party code (can't modify)

When NOT to add exclusions:

Real issues you don't want to fix
Issues without time to fix now
Issues that seem hard

Better approach: Fix real issues, prioritize by severity.

Validation Tuning

If too many false positives:

Review validation panel prompts
Increase consensus threshold (require unanimous)
Add category-specific validation rules

If missing real issues:

Review category agent patterns
Add language-specific patterns
Decrease consensus threshold (1/3 approval)

Wave Management

Typical wave characteristics:

Wave 1: 40-50% of findings (obvious issues)
Wave 2: 25-30% (deeper issues)
Wave 3: 15-20% (subtle issues)
Wave 4+: < 10% each (edge cases)

If waves not converging:

Check for duplicate findings (exclusion not working)
Review category agent overlap (agents finding same things)
Consider loosening the convergence threshold (for example, raising the relative threshold)

Metrics and Monitoring

Success Metrics

Track these over time:

Audit Success:
- Convergence reached: Yes/No
- Waves to convergence: 4 (target: 3-5)
- Total findings: 137 (varies by codebase)
- Validation rate: 75% (target: 60-80%)

Finding Distribution:
- High severity: 15% (target: < 20%)
- Medium severity: 45% (target: 40-60%)
- Low severity: 40% (target: 30-50%)

Panel Effectiveness:
- Strong consensus: 60% (target: > 50%)
- Weak consensus: 30% (target: 20-40%)
- Inconclusive: 10% (target: < 10%)
- Abstention rate: 5% (target: < 10%)

Quality Indicators

Healthy audit:

Converges in 3-5 waves
Validation rate 60-80%
Strong consensus > 50%
Abstention rate < 10%

Warning signs:

Doesn't converge after 6 waves (agents finding same things)
Validation rate > 95% (rubber stamping)
Validation rate < 40% (too strict)
Inconclusive rate > 20% (poor context)
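
These warning signs lend themselves to an automated check after each run. The sketch below is illustrative only: `audit_warnings` and its parameters are hypothetical, and rates are assumed to be fractions between 0 and 1.

```python
def audit_warnings(waves_run: int, converged: bool,
                   validation_rate: float, inconclusive_rate: float,
                   max_waves: int = 6) -> list[str]:
    """Return the warning signs triggered by one audit's metrics."""
    warnings = []
    if not converged and waves_run >= max_waves:
        warnings.append("No convergence after max waves "
                        "(agents may be finding the same things)")
    if validation_rate > 0.95:
        warnings.append("Validation rate > 95% (possible rubber stamping)")
    if validation_rate < 0.40:
        warnings.append("Validation rate < 40% (panel may be too strict)")
    if inconclusive_rate > 0.20:
        warnings.append("Inconclusive rate > 20% (findings may lack context)")
    return warnings


for message in audit_warnings(6, False, 0.97, 0.25):
    print(message)
```

An empty list corresponds to none of the warning signs being present; it does not by itself confirm that every healthy-audit target was met.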

Troubleshooting

"Audit not converging"

Symptoms: Reaches max waves without convergence

Causes:

Category agents finding duplicate issues
Exclusion filtering not working
Convergence threshold too tight

Solutions:

Review findings for duplicates
Check exclusion patterns are matching
Increase relative threshold to 10%
Reduce max waves to 5
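
Reviewing findings for duplicates can be partly automated. A minimal sketch, assuming that two findings pointing at the same file and line are likely duplicates even when different category agents reported them; `find_duplicates` is a hypothetical helper, and the second sample finding is invented for the example.

```python
from collections import defaultdict


def find_duplicates(findings: list[dict]) -> dict[tuple, list[str]]:
    """Group finding ids by (file, line) and keep groups with 2+ ids."""
    groups: dict[tuple, list[str]] = defaultdict(list)
    for finding in findings:
        groups[(finding["file"], finding["line"])].append(finding["id"])
    return {loc: ids for loc, ids in groups.items() if len(ids) > 1}


sample_findings = [
    {"id": "dep-001", "file": "src/payments.py", "line": 89},
    # Hypothetical overlapping finding from another category agent.
    {"id": "vis-007", "file": "src/payments.py", "line": 89},
    {"id": "cfg-002", "file": "src/settings.py", "line": 12},
]
print(find_duplicates(sample_findings))
```

A large number of groups here would point toward category agent overlap or exclusion filtering that is not matching as intended.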

"Too many false positives"

Symptoms: Validation rate > 95%, many non-issues

Causes:

Category agents too aggressive
Validation panel too permissive
Patterns not tuned for codebase

Solutions:

Review category agent patterns
Add exclusions for false positive patterns
Require unanimous validation (3/3)
Tune language-specific patterns

"Missing real issues"

Symptoms: Known issues not in findings

Causes:

Category agent gaps
Exclusion too broad
Validation panel too strict

Solutions:

Check if issue matches any category
Review exclusion list for overly broad patterns
Lower consensus threshold to 1/3
Add specific patterns for missed issues

"Validation panel abstaining"

Symptoms: High abstention rate (> 20%)

Causes:

Insufficient context in findings
Agent prompts unclear
Findings outside agent expertise

Solutions:

Include more code context in findings
Review and improve agent prompts
Add fourth "generalist" agent
Improve finding descriptions

Advanced Configuration

Custom Category Agents

Create custom category agent in category_agents/custom.md:

```markdown
# Category Custom: My Special Cases

## Core Question

"What happens when [specific scenario]?"

## Detection Focus

[Patterns to detect...]

## Language-Specific Patterns

[Language examples...]
```

Then enable in config:

```json
{
  "categories": {
    "enabled": [
      "dependency-failures",
      "config-errors",
      "background-work",
      "test-effectiveness",
      "operator-visibility",
      "functional-stubs",
      "custom"
    ]
  }
}
```

Custom Validation Panel

Override validation panel with different agents:

```yaml
# In recipe/audit-workflow.yaml
validation_panel:
  agents:
    - security
    - architect
    - builder
    - domain-expert # Add domain-specific agent

  consensus:
    required: 0.75 # Require 3/4 approval
```

Staged Rollout

Audit codebase incrementally:

```bash
# Phase 1: Critical services only
/silent-degradation-audit ./services/payments ./services/auth

# Phase 2: All services
/silent-degradation-audit ./services

# Phase 3: Full codebase
/silent-degradation-audit .
```

Changelog

Version 1.0.0 (2025-02-24)

Initial release
6 category agents (A-F)
Multi-agent validation panel (2/3 consensus)
Convergence detection (dual thresholds)
Language-agnostic (9 languages)
Battle-tested on CyberGym (~250 bugs)
Integration modes: standalone + sub-loop

Maintainer

rysweet (core maintainer)

Source details

Full Name: rysweet/amplihack
Branch: main
Path in repo: .claude/skills/silent-degradation-audit
