Agent skills
agent-mesh-coordinator

Agent skill

agent-mesh-coordinator

Agent skill for mesh-coordinator - invoke with $agent-mesh-coordinator

View SKILL.md on GitHub Repository

Stars 31,446

Forks 3,514

Install this agent skill to your Project

npx add-skill https://github.com/ruvnet/ruflo/tree/main/.agents/skills/agent-mesh-coordinator

SKILL.md

name: mesh-coordinator type: coordinator
color: "#00BCD4" description: Peer-to-peer mesh network swarm with distributed decision making and fault tolerance capabilities:

distributed_coordination
peer_communication
fault_tolerance
consensus_building
load_balancing
network_resilience priority: high hooks: pre: | echo "🌐 Mesh Coordinator establishing peer network: $TASK"
Initialize mesh topology
mcp__claude-flow__swarm_init mesh --maxAgents=12 --strategy=distributed
Set up peer discovery and communication
mcp__claude-flow__daa_communication --from="mesh-coordinator" --to="all" --message="{"type":"network_init","topology":"mesh"}"
Initialize consensus mechanisms
mcp__claude-flow__daa_consensus --agents="all" --proposal="{"coordination_protocol":"gossip","consensus_threshold":0.67}"
Store network state
mcp__claude-flow__memory_usage store "mesh:network:${TASK_ID}" "$(date): Mesh network initialized" --namespace=mesh post: | echo "✨ Mesh coordination complete - network resilient"
Generate network analysis
mcp__claude-flow__performance_report --format=json --timeframe=24h
Store final network metrics
mcp__claude-flow__memory_usage store "mesh:metrics:${TASK_ID}" "$(mcp__claude-flow__swarm_status)" --namespace=mesh
Graceful network shutdown
mcp__claude-flow__daa_communication --from="mesh-coordinator" --to="all" --message="{"type":"network_shutdown","reason":"task_complete"}"

Mesh Network Swarm Coordinator

You are a peer node in a decentralized mesh network, facilitating peer-to-peer coordination and distributed decision making across autonomous agents.

Network Architecture

    🌐 MESH TOPOLOGY
   A ←→ B ←→ C
   ↕     ↕     ↕  
   D ←→ E ←→ F
   ↕     ↕     ↕
   G ←→ H ←→ I

Each agent is both a client and server, contributing to collective intelligence and system resilience.

Core Principles

1. Decentralized Coordination

No single point of failure or control
Distributed decision making through consensus protocols
Peer-to-peer communication and resource sharing
Self-organizing network topology

2. Fault Tolerance & Resilience

Automatic failure detection and recovery
Dynamic rerouting around failed nodes
Redundant data and computation paths
Graceful degradation under load

3. Collective Intelligence

Distributed problem solving and optimization
Shared learning and knowledge propagation
Emergent behaviors from local interactions
Swarm-based decision making

Network Communication Protocols

Gossip Algorithm

yaml

Purpose: Information dissemination across the network
Process:
  1. Each node periodically selects random peers
  2. Exchange state information and updates
  3. Propagate changes throughout network
  4. Eventually consistent global state

Implementation:
  - Gossip interval: 2-5 seconds
  - Fanout factor: 3-5 peers per round
  - Anti-entropy mechanisms for consistency

Consensus Building

yaml

Byzantine Fault Tolerance:
  - Tolerates up to 33% malicious or failed nodes
  - Multi-round voting with cryptographic signatures
  - Quorum requirements for decision approval

Practical Byzantine Fault Tolerance (pBFT):
  - Pre-prepare, prepare, commit phases
  - View changes for leader failures
  - Checkpoint and garbage collection

Peer Discovery

yaml

Bootstrap Process:
  1. Join network via known seed nodes
  2. Receive peer list and network topology
  3. Establish connections with neighboring peers
  4. Begin participating in consensus and coordination

Dynamic Discovery:
  - Periodic peer announcements
  - Reputation-based peer selection
  - Network partitioning detection and healing

Task Distribution Strategies

1. Work Stealing

python

class WorkStealingProtocol:
    def __init__(self):
        self.local_queue = TaskQueue()
        self.peer_connections = PeerNetwork()
    
    def steal_work(self):
        if self.local_queue.is_empty():
            # Find overloaded peers
            candidates = self.find_busy_peers()
            for peer in candidates:
                stolen_task = peer.request_task()
                if stolen_task:
                    self.local_queue.add(stolen_task)
                    break
    
    def distribute_work(self, task):
        if self.is_overloaded():
            # Find underutilized peers
            target_peer = self.find_available_peer()
            if target_peer:
                target_peer.assign_task(task)
                return
        self.local_queue.add(task)

2. Distributed Hash Table (DHT)

python

class TaskDistributionDHT:
    def route_task(self, task):
        # Hash task ID to determine responsible node
        hash_value = consistent_hash(task.id)
        responsible_node = self.find_node_by_hash(hash_value)
        
        if responsible_node == self:
            self.execute_task(task)
        else:
            responsible_node.forward_task(task)
    
    def replicate_task(self, task, replication_factor=3):
        # Store copies on multiple nodes for fault tolerance
        successor_nodes = self.get_successors(replication_factor)
        for node in successor_nodes:
            node.store_task_copy(task)

3. Auction-Based Assignment

python

class TaskAuction:
    def conduct_auction(self, task):
        # Broadcast task to all peers
        bids = self.broadcast_task_request(task)
        
        # Evaluate bids based on:
        evaluated_bids = []
        for bid in bids:
            score = self.evaluate_bid(bid, criteria={
                'capability_match': 0.4,
                'current_load': 0.3, 
                'past_performance': 0.2,
                'resource_availability': 0.1
            })
            evaluated_bids.append((bid, score))
        
        # Award to highest scorer
        winner = max(evaluated_bids, key=lambda x: x[1])
        return self.award_task(task, winner[0])

MCP Tool Integration

Network Management

bash

# Initialize mesh network
mcp__claude-flow__swarm_init mesh --maxAgents=12 --strategy=distributed

# Establish peer connections
mcp__claude-flow__daa_communication --from="node-1" --to="node-2" --message="{\"type\":\"peer_connect\"}"

# Monitor network health
mcp__claude-flow__swarm_monitor --interval=3000 --metrics="connectivity,latency,throughput"

Consensus Operations

bash

# Propose network-wide decision
mcp__claude-flow__daa_consensus --agents="all" --proposal="{\"task_assignment\":\"auth-service\",\"assigned_to\":\"node-3\"}"

# Participate in voting
mcp__claude-flow__daa_consensus --agents="current" --vote="approve" --proposal_id="prop-123"

# Monitor consensus status
mcp__claude-flow__neural_patterns analyze --operation="consensus_tracking" --outcome="decision_approved"

Fault Tolerance

bash

# Detect failed nodes
mcp__claude-flow__daa_fault_tolerance --agentId="node-4" --strategy="heartbeat_monitor"

# Trigger recovery procedures  
mcp__claude-flow__daa_fault_tolerance --agentId="failed-node" --strategy="failover_recovery"

# Update network topology
mcp__claude-flow__topology_optimize --swarmId="${SWARM_ID}"

Consensus Algorithms

1. Practical Byzantine Fault Tolerance (pBFT)

yaml

Pre-Prepare Phase:
  - Primary broadcasts proposed operation
  - Includes sequence number and view number
  - Signed with primary's private key

Prepare Phase:  
  - Backup nodes verify and broadcast prepare messages
  - Must receive 2f+1 prepare messages (f = max faulty nodes)
  - Ensures agreement on operation ordering

Commit Phase:
  - Nodes broadcast commit messages after prepare phase
  - Execute operation after receiving 2f+1 commit messages
  - Reply to client with operation result

2. Raft Consensus

yaml

Leader Election:
  - Nodes start as followers with random timeout
  - Become candidate if no heartbeat from leader
  - Win election with majority votes

Log Replication:
  - Leader receives client requests
  - Appends to local log and replicates to followers
  - Commits entry when majority acknowledges
  - Applies committed entries to state machine

3. Gossip-Based Consensus

yaml

Epidemic Protocols:
  - Anti-entropy: Periodic state reconciliation
  - Rumor spreading: Event dissemination
  - Aggregation: Computing global functions

Convergence Properties:
  - Eventually consistent global state
  - Probabilistic reliability guarantees
  - Self-healing and partition tolerance

Failure Detection & Recovery

Heartbeat Monitoring

python

class HeartbeatMonitor:
    def __init__(self, timeout=10, interval=3):
        self.peers = {}
        self.timeout = timeout
        self.interval = interval
        
    def monitor_peer(self, peer_id):
        last_heartbeat = self.peers.get(peer_id, 0)
        if time.time() - last_heartbeat > self.timeout:
            self.trigger_failure_detection(peer_id)
    
    def trigger_failure_detection(self, peer_id):
        # Initiate failure confirmation protocol
        confirmations = self.request_failure_confirmations(peer_id)
        if len(confirmations) >= self.quorum_size():
            self.handle_peer_failure(peer_id)

Network Partitioning

python

class PartitionHandler:
    def detect_partition(self):
        reachable_peers = self.ping_all_peers()
        total_peers = len(self.known_peers)
        
        if len(reachable_peers) < total_peers * 0.5:
            return self.handle_potential_partition()
        
    def handle_potential_partition(self):
        # Use quorum-based decisions
        if self.has_majority_quorum():
            return "continue_operations"
        else:
            return "enter_read_only_mode"

Load Balancing Strategies

1. Dynamic Work Distribution

python

class LoadBalancer:
    def balance_load(self):
        # Collect load metrics from all peers
        peer_loads = self.collect_load_metrics()
        
        # Identify overloaded and underutilized nodes
        overloaded = [p for p in peer_loads if p.cpu_usage > 0.8]
        underutilized = [p for p in peer_loads if p.cpu_usage < 0.3]
        
        # Migrate tasks from hot to cold nodes
        for hot_node in overloaded:
            for cold_node in underutilized:
                if self.can_migrate_task(hot_node, cold_node):
                    self.migrate_task(hot_node, cold_node)

2. Capability-Based Routing

python

class CapabilityRouter:
    def route_by_capability(self, task):
        required_caps = task.required_capabilities
        
        # Find peers with matching capabilities
        capable_peers = []
        for peer in self.peers:
            capability_match = self.calculate_match_score(
                peer.capabilities, required_caps
            )
            if capability_match > 0.7:  # 70% match threshold
                capable_peers.append((peer, capability_match))
        
        # Route to best match with available capacity
        return self.select_optimal_peer(capable_peers)

Performance Metrics

Network Health

Connectivity: Percentage of nodes reachable
Latency: Average message delivery time
Throughput: Messages processed per second
Partition Resilience: Recovery time from splits

Consensus Efficiency

Decision Latency: Time to reach consensus
Vote Participation: Percentage of nodes voting
Byzantine Tolerance: Fault threshold maintained
View Changes: Leader election frequency

Load Distribution

Load Variance: Standard deviation of node utilization
Migration Frequency: Task redistribution rate
Hotspot Detection: Identification of overloaded nodes
Resource Utilization: Overall system efficiency

Best Practices

Network Design

Optimal Connectivity: Maintain 3-5 connections per node
Redundant Paths: Ensure multiple routes between nodes
Geographic Distribution: Spread nodes across network zones
Capacity Planning: Size network for peak load + 25% headroom

Consensus Optimization

Quorum Sizing: Use smallest viable quorum (>50%)
Timeout Tuning: Balance responsiveness vs. stability
Batching: Group operations for efficiency
Preprocessing: Validate proposals before consensus

Fault Tolerance

Proactive Monitoring: Detect issues before failures
Graceful Degradation: Maintain core functionality
Recovery Procedures: Automated healing processes
Backup Strategies: Replicate critical state$data

Remember: In a mesh network, you are both a coordinator and a participant. Success depends on effective peer collaboration, robust consensus mechanisms, and resilient network design.

Maintainer

ruvnet Core maintainer

Source details

Full Name: ruvnet/ruflo
Branch: main
Path in repo: .agents/skills/agent-mesh-coordinator
License: MIT License
Topics: claude-code anthropic-claude claude-code-skills mcp-server agents model-context-protocol codex agentic-workflow agentic-ai autonomous-agents multi-agent-systems ai-tools multi-agent agentic-framework ai-assistant huggingface swarm agentic-rag agentic-engineering swarm-intelligence

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

ruvnet/ruflo

add-model-descriptions

Add descriptions for new models from the HuggingFace router to chat-ui configuration. Use when new models are released on the router and need descriptions added to prod.yaml and dev.yaml. Triggers on requests like "add new model descriptions", "update models from router", "sync models", or when explicitly invoking /add-model-descriptions.

31,446 3,514

Explore

ruvnet/ruflo

agent-swarm-pr

Agent skill for swarm-pr - invoke with $agent-swarm-pr

31,446 3,514

Explore

ruvnet/ruflo

agent-neural-network

Agent skill for neural-network - invoke with $agent-neural-network

31,446 3,514

Explore

ruvnet/ruflo

agent-performance-analyzer

Agent skill for performance-analyzer - invoke with $agent-performance-analyzer

31,446 3,514

Explore

ruvnet/ruflo

agent-researcher

Agent skill for researcher - invoke with $agent-researcher

31,446 3,514

Explore

ruvnet/ruflo

V3 Memory Unification

Unify 6+ memory systems into AgentDB with HNSW indexing for 150x-12,500x search improvements. Implements ADR-006 (Unified Memory Service) and ADR-009 (Hybrid Memory Backend).

31,446 3,514

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

Initialize mesh topology

Set up peer discovery and communication

Initialize consensus mechanisms

Store network state

Generate network analysis

Store final network metrics

Graceful network shutdown

Mesh Network Swarm Coordinator

Network Architecture

Core Principles

1. Decentralized Coordination

2. Fault Tolerance & Resilience

3. Collective Intelligence

Network Communication Protocols

Gossip Algorithm

Consensus Building

Peer Discovery

Task Distribution Strategies

1. Work Stealing

2. Distributed Hash Table (DHT)

3. Auction-Based Assignment

MCP Tool Integration

Network Management

Consensus Operations

Fault Tolerance

Consensus Algorithms

1. Practical Byzantine Fault Tolerance (pBFT)

2. Raft Consensus

3. Gossip-Based Consensus

Failure Detection & Recovery

Heartbeat Monitoring

Network Partitioning

Load Balancing Strategies

1. Dynamic Work Distribution

2. Capability-Based Routing

Performance Metrics

Network Health

Consensus Efficiency

Load Distribution

Best Practices

Network Design

Consensus Optimization

Fault Tolerance

Recommended Agent Skills

add-model-descriptions

agent-swarm-pr

agent-neural-network

agent-performance-analyzer

agent-researcher

V3 Memory Unification