Agent skill
multi-agent-systems
Design and implement multi-agent LLM architectures using the orchestrator-subagent pattern. Use when: (1) Deciding whether to use multi-agent vs single-agent systems, (2) Implementing context isolation for high-volume operations, (3) Parallelizing independent research tasks, (4) Creating specialized agents with focused tool sets, (5) Building verification subagents for quality assurance, or (6) Analyzing context-centric decomposition boundaries.
Install this agent skill to your Project
npx add-skill https://github.com/hexbee/hello-skills/tree/main/skills/multi-agent-systems
SKILL.md
Multi-Agent Systems
When to Use Multi-Agent Architectures
Multi-agent systems introduce overhead. Every additional agent represents another potential point of failure, another set of prompts to maintain, and another source of unexpected behavior.
Multi-agent systems use 3-10x more tokens than single-agent approaches due to:
- Duplicating context across agents
- Coordination messages between agents
- Summarizing results for handoffs
Start with a Single Agent
A well-designed single agent with appropriate tools can accomplish far more than expected. Use single agent when:
- Tasks are sequential and context-dependent
- Tool count is under 15-20
- No clear benefit from parallelization
Three Cases Where Multi-Agent Excels
- Context pollution - Subtasks generate >1000 tokens but most info is irrelevant to main task
- Parallelization - Tasks can run independently and explore larger search space
- Specialization - Different tasks need different tools, prompts, or domain expertise
Decision Framework
Context Protection Pattern
Use when subtasks generate large context but only summary is needed for main task.
Example: Customer Support
class OrderLookupAgent:
def lookup_order(self, order_id: str) -> dict:
messages = [{"role": "user", "content": f"Get essential details for order {order_id}"}]
response = client.messages.create(
model="claude-sonnet-4-5", max_tokens=1024,
messages=messages, tools=[get_order_details_tool]
)
return extract_summary(response) # Returns 50-100 tokens, not 2000+
class SupportAgent:
def handle_issue(self, user_message: str):
if needs_order_info(user_message):
order_id = extract_order_id(user_message)
order_summary = OrderLookupAgent().lookup_order(order_id)
context = f"Order {order_id}: {order_summary['status']}, purchased {order_summary['date']}"
# Main agent gets clean context
Best when:
- Subtask generates >1000 tokens, most irrelevant
- Subtast is well-defined with clear extraction criteria
- Lookup/retrieval operations need filtering before use
Parallelization Pattern
Use when exploring larger search space or independent research facets.
import asyncio
from anthropic import AsyncAnthropic
client = AsyncAnthropic()
async def research_topic(query: str) -> dict:
facets = await lead_agent.decompose_query(query)
tasks = [research_subagent(facet) for facet in facets]
results = await asyncio.gather(*tasks)
return await lead_agent.synthesize(results)
async def research_subagent(facet: str) -> dict:
messages = [{"role": "user", "content": f"Research: {facet}"}]
response = await client.messages.create(
model="claude-sonnet-4-5", max_tokens=4096,
messages=messages, tools=[web_search, read_document]
)
return extract_findings(response)
Benefit: Thoroughness, not speed. Covers more ground at higher token cost.
Specialization Patterns
Tool Set Specialization
Split by domain when agent has 20+ tools, shows domain confusion, or degraded performance.
Signs you need specialization:
- Quantity: 20+ tools
- Domain confusion: Tools span unrelated domains
- Degraded performance: New tools hurt existing tasks
System Prompt Specialization
Different tasks require conflicting behavioral modes:
- Customer support: empathetic, patient
- Code review: precise, critical
- Compliance: rigid rule-following
- Brainstorming: creative flexibility
Domain Expertise Specialization
Deep domain context that would overwhelm a generalist:
- Legal analysis: case law, regulatory frameworks
- Medical research: clinical trial methodology
Multi-Platform Integration Example
class CRMAgent:
system_prompt = """You are a CRM specialist. You manage contacts,
opportunities, and account records. Always verify record ownership
before updates and maintain data integrity across related records."""
tools = [crm_get_contacts, crm_create_opportunity] # 8-10 CRM tools
class MarketingAgent:
system_prompt = """You are a marketing automation specialist. You
manage campaigns, lead scoring, and email sequences."""
tools = [marketing_get_campaigns, marketing_create_lead] # 8-10 tools
class OrchestratorAgent:
def execute(self, user_request: str):
response = client.messages.create(
model="claude-sonnet-4-5", max_tokens=1024,
system="""Route to appropriate specialist:
- CRM: Contacts, opportunities, accounts, sales pipeline
- Marketing: Campaigns, lead nurturing, email sequences""",
messages=[{"role": "user", "content": user_request}],
tools=[delegate_to_crm, delegate_to_marketing]
)
return response
Context-Centric Decomposition
Problem-centric (counterproductive): Split by work type (writer, tester, reviewer) - creates coordination overhead, context loss at handoffs.
Context-centric (effective): Agent handling a feature also handles its tests - already has necessary context.
Effective Boundaries
- Independent research paths
- Separate components with clean API contracts
- Blackbox verification
Problematic Boundaries
- Sequential phases of same work
- Tightly coupled components
- Work requiring shared state
Verification Subagent Pattern
Dedicated agent for testing/validating main agent's work. Succeeds because verification requires minimal context transfer.
class CodingAgent:
def implement_feature(self, requirements: str) -> dict:
response = client.messages.create(
model="claude-sonnet-4-5", max_tokens=4096,
messages=[{"role": "user", "content": f"Implement: {requirements}"}],
tools=[read_file, write_file, list_directory]
)
return {"code": response.content, "files_changed": extract_files(response)}
class VerificationAgent:
def verify_implementation(self, requirements: str, files_changed: list) -> dict:
messages = [{"role": "user", "content": f"""
Requirements: {requirements}
Files changed: {files_changed}
Run the complete test suite and verify:
1. All existing tests pass
2. New functionality works as specified
3. No obvious errors or security issues
You MUST run: pytest --verbose
Only mark as PASSED if ALL tests pass with no failures.
"""}]
response = client.messages.create(
model="claude-sonnet-4-5", max_tokens=4096,
messages=messages, tools=[run_tests, execute_code, read_file]
)
return {"passed": extract_pass_fail(response), "issues": extract_issues(response)}
Mitigate "Early Victory Problem"
Verifier marks passing without thorough testing. Prevention:
- Concrete criteria: "Run full test suite" not "make sure it works"
- Comprehensive checks: Test multiple scenarios and edge cases
- Negative tests: Confirm inputs that should fail do fail
- Explicit instructions: "You MUST run the complete test suite"
Moving Forward Checklist
Before adding multi-agent complexity:
- Genuine constraints exist (context limits, parallelization, specialization need)
- Decomposition follows context, not problem type
- Clear verification points where subagents can validate
Start with simplest approach that works. Add complexity only when evidence supports it.
References
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
analogy-commentary
Turn source content into a familiar analogy framework, then produce concise commentary angles and a short take. Use when the user asks whether a post, article, speech, or idea is “like” a known story, theory, faction, historical pattern, or strategic lens.
deep-productivity
Master deep work productivity through the three types of work framework (Building, Maintenance, Recovery). Use when user needs to: (1) Build a sustainable deep work routine with just 1 hour/day, (2) Create vision/anti-vision for life direction, (3) Structure goals using the 10-year → 1-year → 1-month → 1-week hierarchy, (4) Apply project-based learning to bridge skill gaps, (5) Identify lever-moving tasks that actually progress goals, (6) Balance focus work with necessary recovery for creativity.
agent-first-product-strategy
Reframe AI product and SaaS strategy from human-user assumptions to agent-first execution. Use when redefining product positioning, success metrics, API/docs priorities, go-to-market, or roadmap decisions for an AI-native market where agents are primary software users.
claim-ledger
将研究与写作中的主张转成“Claim-Evidence-Boundary”可核验账本。用于在产出前强制补齐证据、反证、边界与可发布性判断,防止无依据断言并提升复用性。
debog-yourself
Help users identify and escape psychological traps that are holding them back. Use when user feels stuck, unable to progress, facing a deadlock, or experiencing decision paralysis. Provide diagnostic frameworks and specific strategies to help users understand which trap pattern they're in and find a path forward.
china-model-selection-guide
China model selection and task-routing guide for Doubao-Seed-2.0-Code, GLM-5, MiniMax-M2.5, and Kimi-K2.5. Use when users need to choose the best-fit model by input type, task complexity, engineering constraints, and delivery goals, including staged multi-model workflows.
Didn't find tool you were looking for?