Agent skill

error-coordinator

Expert in making multi-agent systems resilient. Specializes in detecting loops, hallucinations, and failures, and implementing self-healing workflows. Use when designing error handling for agent systems, implementing retry strategies, or building resilient AI workflows.


Install this agent skill in your project

npx add-skill https://github.com/404kidwiz/claude-supercode-skills/tree/main/error-coordinator-skill

SKILL.md

Error Coordinator

Purpose

Provides expertise in building resilient multi-agent systems with robust error handling, failure detection, and recovery mechanisms. Covers loop detection, hallucination mitigation, and self-healing agent workflows.

When to Use

  • Designing error handling for agent systems
  • Implementing retry and recovery strategies
  • Building self-healing AI workflows
  • Detecting agent loops and infinite recursion
  • Mitigating hallucinations in agent outputs
  • Implementing circuit breakers for agents
  • Coordinating failure recovery across agents

Quick Start

Invoke this skill when:

  • Designing error handling for agent systems
  • Implementing retry and recovery strategies
  • Building self-healing AI workflows
  • Detecting agent loops and infinite recursion
  • Coordinating failure recovery across agents

Do NOT invoke when:

  • Organizing agent teams (use agent-organizer)
  • Debugging application errors (use debugger)
  • Handling production incidents (use incident-responder)
  • Detecting code error patterns (use error-detective)

Decision Framework

Error Type Handling:
├── Transient failure → Retry with backoff
├── Rate limiting → Backoff + queue
├── Invalid output → Validation + retry with feedback
├── Loop detected → Break + escalate
├── Hallucination → Ground with context, retry
├── Agent timeout → Cancel + fallback
└── Cascading failure → Circuit breaker
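
The error-type tree above can be read as a lookup table. The sketch below is one possible encoding, not a prescribed implementation: the exception classes `RateLimitError`, `InvalidOutput`, and `LoopDetected`, the action labels, and the `classify` and `plan_for` helpers are all hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    TRANSIENT = "transient failure"
    RATE_LIMIT = "rate limiting"
    INVALID_OUTPUT = "invalid output"
    LOOP = "loop detected"
    HALLUCINATION = "hallucination"
    TIMEOUT = "agent timeout"
    CASCADING = "cascading failure"


# Mirrors the tree above; the action names are illustrative labels only.
HANDLING = {
    ErrorType.TRANSIENT: "retry_with_backoff",
    ErrorType.RATE_LIMIT: "backoff_and_queue",
    ErrorType.INVALID_OUTPUT: "validate_and_retry_with_feedback",
    ErrorType.LOOP: "break_and_escalate",
    ErrorType.HALLUCINATION: "ground_with_context_and_retry",
    ErrorType.TIMEOUT: "cancel_and_fallback",
    ErrorType.CASCADING: "circuit_breaker",
}


# Hypothetical exception types an agent runtime might raise.
class RateLimitError(Exception):
    pass


class InvalidOutput(Exception):
    pass


class LoopDetected(Exception):
    pass


def classify(exc: Exception) -> Optional[ErrorType]:
    """Map an exception to an error type, or None when it is unrecognized.

    Hallucinations and cascading failures are usually detected by validators
    and failure-rate monitors rather than exceptions, so they are not
    classified here.
    """
    if isinstance(exc, RateLimitError):
        return ErrorType.RATE_LIMIT
    if isinstance(exc, InvalidOutput):
        return ErrorType.INVALID_OUTPUT
    if isinstance(exc, LoopDetected):
        return ErrorType.LOOP
    if isinstance(exc, TimeoutError):
        return ErrorType.TIMEOUT
    if isinstance(exc, ConnectionError):
        return ErrorType.TRANSIENT
    return None


def plan_for(exc: Exception) -> str:
    """Return the handling action for an exception; escalate unknown errors."""
    error_type = classify(exc)
    if error_type is None:
        return "escalate"  # assumption: unknown errors go to a supervisor
    return HANDLING[error_type]
```

Keeping the mapping in data rather than in nested conditionals makes it easy to audit against the tree and to extend when a new error type appears.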

Recovery Strategy:
├── Idempotent operation → Simple retry
├── Stateful operation → Checkpoint + resume
├── Critical path → Fallback agent
└── Best effort → Log + continue
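
For the "Stateful operation → Checkpoint + resume" branch, a minimal sketch might persist the index of the last completed step so a rerun skips finished work. The function name `run_with_checkpoint`, the JSON file layout, and the assumption that each step is a zero-argument callable are all illustrative choices, not part of the skill itself.

```python
import json
import os


def run_with_checkpoint(steps, path):
    """Run steps in order, recording progress in a JSON checkpoint file.

    If the file already exists, execution resumes after the last completed
    step. The checkpoint is removed once every step has finished.
    """
    done = 0
    if os.path.exists(path):
        with open(path) as f:
            done = json.load(f)["done"]

    for index in range(done, len(steps)):
        steps[index]()  # may raise; the checkpoint then still points here
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"done": index + 1}, f)
        os.replace(tmp_path, path)  # atomic swap avoids a half-written file

    if os.path.exists(path):
        os.remove(path)
```

This only tracks which steps ran; a real workflow would also need each step's side effects to be durable, or idempotent, before the checkpoint is written.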

Core Workflows

1. Loop Detection System

  1. Track agent invocation history
  2. Detect repeated state patterns
  3. Set maximum iteration limits
  4. Implement escape hatch triggers
  5. Log loop occurrences for analysis
  6. Escalate to supervisor or human
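
The steps above can be sketched as a small detector that fingerprints each (agent, state) pair, counts repeats, and enforces an iteration cap. This is one possible shape under stated assumptions: `LoopDetector`, `LoopDetected`, and the default limits are hypothetical, and the state is assumed to be JSON-serializable (anything else falls back to `str`).

```python
import hashlib
import json
import logging
from collections import Counter

logger = logging.getLogger("loop_detector")


class LoopDetected(Exception):
    """Raised so a supervisor (or a human) can take over."""


class LoopDetector:
    def __init__(self, max_iterations=20, max_repeats=3):
        self.max_iterations = max_iterations  # hard cap on total invocations
        self.max_repeats = max_repeats        # same fingerprint seen this often
        self.history = []                     # invocation history, in order
        self.seen = Counter()                 # fingerprint -> occurrence count

    def record(self, agent, state):
        """Record one invocation; raise LoopDetected when a limit is hit."""
        payload = json.dumps([agent, state], sort_keys=True, default=str)
        fingerprint = hashlib.sha256(payload.encode()).hexdigest()
        self.history.append(fingerprint)
        self.seen[fingerprint] += 1

        if self.seen[fingerprint] >= self.max_repeats:
            logger.warning("repeated state for agent %s", agent)
            raise LoopDetected(f"{agent} repeated the same state")
        if len(self.history) > self.max_iterations:
            logger.warning("iteration limit exceeded at agent %s", agent)
            raise LoopDetected("maximum iterations exceeded")
```

A caller would invoke `detector.record(agent_name, state)` before each agent step and treat `LoopDetected` as the escape hatch. Exact-match fingerprints miss loops where the state drifts slightly each round, so the iteration cap remains the backstop.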

2. Hallucination Mitigation

  1. Ground responses with source data
  2. Implement output validation
  3. Cross-check with retrieval
  4. Add confidence scoring
  5. Flag low-confidence outputs
  6. Provide feedback for retry
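
As a rough illustration of this workflow, the sketch below scores an answer by how many of its tokens appear in the source text, retries with feedback when the score is low, and flags the result if no attempt clears the threshold. The lexical-overlap score is a deliberately crude stand-in for real validation or retrieval cross-checking, and `generate` is a hypothetical callable that takes a prompt string and returns the model's answer.

```python
import re


def support_score(answer: str, sources: list[str]) -> float:
    """Fraction of the answer's tokens that also occur in the sources."""
    def tokenize(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 0.0
    source_tokens = set().union(*(tokenize(s) for s in sources))
    return len(answer_tokens & source_tokens) / len(answer_tokens)


def grounded_answer(generate, question, sources, threshold=0.7, max_attempts=3):
    """Ask `generate` for a grounded answer, retrying with feedback.

    Returns a dict with the answer, its confidence score, and a `flagged`
    field that is True when no attempt reached the threshold.
    """
    context = "\n".join(sources)
    feedback = ""
    best_answer, best_score = "", 0.0

    for _ in range(max_attempts):
        prompt = (
            f"Answer using only this context:\n{context}\n\n"
            f"Question: {question}{feedback}"
        )
        answer = generate(prompt)
        score = support_score(answer, sources)
        if score >= threshold:
            return {"answer": answer, "confidence": score, "flagged": False}
        if score >= best_score:
            best_answer, best_score = answer, score
        feedback = (
            f"\n\nYour previous answer was poorly supported by the context "
            f"(score {score:.2f}). Use only facts stated in the context."
        )

    # Low confidence after all attempts: return the best try, flagged.
    return {"answer": best_answer, "confidence": best_score, "flagged": True}
```

Token overlap cannot tell a correct paraphrase from a fabrication that reuses source vocabulary, so in practice the score would come from a stronger checker; the retry-with-feedback and flagging structure stays the same.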

3. Circuit Breaker Implementation

  1. Track failure rates per agent
  2. Define failure threshold
  3. Open circuit on threshold breach
  4. Provide fallback behavior
  5. Implement half-open state for testing
  6. Close circuit on recovery
  7. Monitor and alert on breaker state
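
A minimal sketch of these states follows. It simplifies step 1 by counting consecutive failures rather than computing a failure rate over a window, and it omits monitoring hooks. The class name, the default thresholds, and the injectable `clock` parameter (included so the example can be exercised without waiting) are assumptions made for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and no fallback was supplied."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds before a trial call
        self.clock = clock
        self.failures = 0                   # consecutive failures so far
        self.state = "closed"               # closed | open | half_open
        self.opened_at = 0.0

    def call(self, fn, fallback=None):
        """Call fn through the breaker, using fallback when it is unavailable."""
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "half_open"    # let one trial call through
            elif fallback is not None:
                return fallback()
            else:
                raise CircuitOpenError("circuit is open")

        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, opens the circuit.
            if (self.state == "half_open"
                    or self.failures >= self.failure_threshold):
                self.state = "open"
                self.opened_at = self.clock()
            if fallback is not None:
                return fallback()
            raise

        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.state = "closed"
        return result
```

One breaker instance per agent keeps a failing agent from dragging down healthy ones. This sketch is not thread-safe; concurrent callers would need a lock around the state transitions.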

Best Practices

  • Implement timeouts for all agent calls
  • Use exponential backoff with jitter
  • Log all failures with full context
  • Design for graceful degradation
  • Test failure scenarios explicitly
  • Monitor error rates and patterns
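
To illustrate "exponential backoff with jitter", the sketch below uses the full-jitter variant: each wait is drawn uniformly between zero and an exponentially growing cap. The function name, the defaults, and the `sleep` parameter (injectable so the example can be tested without waiting) are assumptions; `fn` is expected to enforce its own timeout and raise `TimeoutError` when it expires.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call fn, retrying transient errors with capped, jittered backoff.

    Only exceptions listed in `retry_on` are retried; anything else
    propagates immediately. The final failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # full jitter spreads out retries
```

The bounded `max_attempts` addresses the infinite-retry problem, and the random wait keeps many clients that failed together from retrying in lockstep.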

Anti-Patterns

| Anti-Pattern        | Problem             | Correct Approach                |
|---------------------|---------------------|---------------------------------|
| Infinite retries    | Resource exhaustion | Max retry limits                |
| Silent failures     | Hidden problems     | Log and alert                   |
| No timeouts         | Hung processes      | Always set timeouts             |
| Same retry interval | Thundering herd     | Exponential backoff with jitter |
| No fallbacks        | Complete failure    | Graceful degradation            |
