Agent Performance Dashboard

Purpose

Provides visibility into agent usage patterns to optimize delegation and identify improvement opportunities.

When I Activate

I automatically load when you mention:

"agent performance" or "agent metrics"
"agent dashboard" or "agent usage"
"which agents are used" or "underutilized agents"
"agent success rate" or "agent statistics"

What I Do

Track Invocations: Record agent usage via workflow tracker
Measure Success: Track completion rates per agent
Analyze Patterns: Identify usage trends and gaps
Generate Reports: Create actionable dashboards

Quick Start

User: "Show me agent performance metrics"
Skill: *activates automatically*
       "Generating agent performance report..."

Core Capabilities

1. Report Generation

Generate a performance report by reading workflow logs and aggregating agent metrics:

User: "Generate agent performance report"

Report includes:

Invocation counts per agent
Success/failure rates
Average completion times (when tracked)
Underutilized agents list
Recommendations for optimization

2. Live Tracking

Track agent invocations during workflow execution using the existing workflow_tracker:

python

# Already available in .claude/tools/amplihack/hooks/workflow_tracker.py
from workflow_tracker import log_agent_invocation

log_agent_invocation(
    agent_name="architect",
    purpose="Design authentication module",
    step_number=2
)

3. Metrics Storage

Metrics are stored in:

Raw logs: .claude/runtime/logs/workflow_adherence/workflow_execution.jsonl
Aggregated: .claude/runtime/metrics/agent_performance.yaml

Report Format

Summary Dashboard

yaml

# Agent Performance Summary
# Generated: 2025-11-25

total_invocations: 142

agents:
  architect:
    invocations: 45
    success_rate: 95.6%
    avg_duration_ms: 2340
    trend: increasing

  builder:
    invocations: 38
    success_rate: 89.5%
    avg_duration_ms: 4520
    trend: stable

  reviewer:
    invocations: 25
    success_rate: 100%
    avg_duration_ms: 1890
    trend: increasing

underutilized:
  - database (0 invocations in last 30 days)
  - integration (2 invocations in last 30 days)
  - patterns (3 invocations in last 30 days)

recommendations:
  - Consider using database agent for schema work
  - Integration agent available for external service connections
  - Patterns agent can identify reusable solutions

Implementation Guide

To Generate a Report

Read workflow execution logs:

Read: .claude/runtime/logs/workflow_adherence/workflow_execution.jsonl

Filter for agent_invoked events:

json

{ "event": "agent_invoked", "agent": "architect", "purpose": "...", "step": 2 }

Aggregate by agent name:
- Count invocations
- Calculate success rates from workflow_end events
- Compute average durations
Identify underutilized agents:
- List all available agents from .claude/agents/amplihack/
- Compare against invocation counts
- Flag agents with <5 invocations in analysis period

Write report to:

.claude/runtime/metrics/agent_performance.yaml

Available Agents Inventory

Core Agents (6):

architect, builder, reviewer, tester, optimizer, api-designer

Specialized Agents (25):

ambiguity, amplifier-cli-architect, analyzer, azure-kubernetes-expert
ci-diagnostic-workflow, cleanup, database, documentation-writer
fallback-cascade, fix-agent, integration, knowledge-archaeologist
memory-manager, multi-agent-debate, n-version-validator, patterns
philosophy-guardian, pre-commit-diagnostic, preference-reviewer
prompt-writer, rust-programming-expert, security, visualization-architect
worktree-manager, xpia-defense

Note: Agent count may change as specialized agents are added/removed. Use ls .claude/agents/amplihack/specialized/ for current count.

Tracking Best Practices

When Invoking Agents

Always log invocations for accurate tracking:

python

# Before invoking an agent via Task tool
log_agent_invocation(
    agent_name="security",
    purpose="Audit authentication implementation",
    step_number=7  # Optional: link to workflow step
)

# Then invoke the agent
Task(subagent_type="security", prompt="...")

Workflow Integration

The DEFAULT_WORKFLOW.md specifies agent delegation at each step. This skill helps verify adherence:

Step 1: prompt-writer
Step 2: architect
Step 3: builder
Step 4: tester
Step 5: reviewer
etc.

Configuration

Setting	Default	Description
`ANALYSIS_DAYS`	30	Days of history to analyze
`UNDERUTILIZED_THRESHOLD`	5	Invocations below this = underutilized
`METRICS_FILE`	`agent_performance.yaml`	Output file name

Philosophy Alignment

This skill follows:

Ruthless Simplicity: Uses existing infrastructure (workflow_tracker)
Zero-BS: No placeholders, working aggregation logic
Modular Design: Self-contained skill, clear boundaries
Emergence: Insights emerge from simple tracking patterns

Interpreting Metrics

Success Rate Guidelines

Rate	Assessment	Action
95-100%	Excellent	Maintain current patterns
85-94%	Good	Review occasional failures for patterns
70-84%	Needs Attention	Investigate failure causes, adjust prompts
Below 70%	Critical	Agent may need redesign or prompt overhaul

Invocation Volume Interpretation

High volume (30+ in 30 days): Core workflow agent, ensure reliability
Medium volume (10-29): Regular use, monitor for optimization opportunities
Low volume (5-9): Specialized use case, verify still needed
Very low (<5): Consider if agent is discoverable or relevant

Duration Benchmarks

< 2 seconds: Fast execution, typical for simple analysis
2-10 seconds: Normal for moderate complexity
10-60 seconds: Expected for deep analysis or multi-step tasks
> 60 seconds: May indicate inefficiency, consider optimization

Empty State Handling

When no log data exists (new project or logs cleared):