Agent skill

swarm-local-e2e

Guide for running local E2E tests with API server, Docker lead/worker containers, task creation, log verification, UI dashboard, and cleanup

View SKILL.md on GitHub Repository

Stars 335

Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/desplega-ai/agent-swarm/tree/main/.claude/skills/swarm-local-e2e

SKILL.md

Local E2E Testing Guide

Run full end-to-end tests of the agent swarm locally with a real API server and Docker containers.

When to Use This Skill

This skill should be invoked in two modes:

User-requested QA: The user asks you to run E2E tests, verify a feature, or QA a specific flow. Follow the steps below targeting what they asked for.
Automated change verification: After implementing changes that touch the API, runner, polling, task lifecycle, session logs, Docker entrypoint, or worker/lead behavior — use this skill proactively to verify the changes work end-to-end. Determine what's testable based on the diff:
- Task lifecycle changes (poll, runner, store-progress): Create assigned + pool tasks, verify they complete and have correct logs
- Session log changes: Run two sequential tasks on the same agent, verify log isolation (unique sessionIds, no cross-contamination)
- Docker / entrypoint changes: Build image, start containers, verify boot logs and registration
- UI changes: Start the dashboard, use agent-browser/qa-use to verify rendering
- API endpoint changes: Call the endpoint directly and verify the response

You do not need to run every step — pick the subset relevant to the changes being tested.

Prerequisites

OrbStack or Docker Desktop running (open -a OrbStack if needed)
.env with API_KEY and PORT configured
.env.docker-lead with lead config (AGENT_ID, CLAUDE_CODE_OAUTH_TOKEN, MCP_BASE_URL)
.env.docker with worker config (AGENT_ID, CLAUDE_CODE_OAUTH_TOKEN or OPENROUTER_API_KEY, MCP_BASE_URL)

Step 1: Determine Your Port

Check .env for the configured port — do not assume 3013:

bash

grep ^PORT= .env

Use this value as $PORT throughout. In worktrees, each worktree may have a different port. Always verify and use the value from .env.

Also verify the Docker env files match:

bash

grep MCP_BASE_URL .env.docker-lead .env.docker
# Both should point to http://host.docker.internal:$PORT

If they don't match, update them before starting containers.

Step 2: Clean DB + Start API Server

bash

# Kill any existing API process on your port
lsof -ti :$PORT | xargs kill 2>/dev/null

# Clean DB for fresh state
rm -f agent-swarm-db.sqlite agent-swarm-db.sqlite-wal agent-swarm-db.sqlite-shm

# Start API server
bun run start:http &
# Wait ~3s for startup, confirm "MCP HTTP server running on http://localhost:$PORT/mcp"

Step 3: Build Docker Image

bash

bun run docker:build:worker

This builds agent-swarm-worker:latest from the current code. Rebuild after every code change.

Step 4: Start Lead Container

Use a unique container name to avoid conflicts with other worktrees (e.g. include branch name or feature):

bash

docker run --rm -d \
  --name e2e-lead-$(git branch --show-current | tr '/' '-') \
  --env-file .env.docker-lead \
  -e AGENT_ROLE=lead \
  -e MAX_CONCURRENT_TASKS=1 \
  -p 3201:3000 \
  agent-swarm-worker:latest

Wait ~15s, then verify:

bash

docker logs e2e-lead-$(git branch --show-current | tr '/' '-') 2>&1 | tail -5
# Should see: "[lead] Polling for triggers (0/1 active)..."

If port 3201 is taken by another worktree, pick a different host port (e.g. -p 3211:3000).

Step 5: Start Worker Container

bash

docker run --rm -d \
  --name e2e-worker-$(git branch --show-current | tr '/' '-') \
  --env-file .env.docker \
  -e MAX_CONCURRENT_TASKS=1 \
  -p 3203:3000 \
  agent-swarm-worker:latest

Wait ~15s, then verify:

bash

docker logs e2e-worker-$(git branch --show-current | tr '/' '-') 2>&1 | tail -5
# Should see: "[worker] Polling for triggers (0/1 active)..."

Step 6: Verify Registration

Use context-mode execute (not curl directly due to hook restrictions):

javascript

const headers = { 'Authorization': 'Bearer $API_KEY', 'Content-Type': 'application/json' };
const agents = await (await fetch('http://localhost:$PORT/api/agents', { headers })).json();
for (const a of agents.agents) {
  console.log(`${a.name} | isLead: ${a.isLead} | status: ${a.status} | id: ${a.id}`);
}

Should show both lead and worker registered as idle. Save the agent IDs for task creation.

Step 7: Create Tasks

Assigned task (picked up by lead)

javascript

const t = await (await fetch('http://localhost:$PORT/api/tasks', {
  method: 'POST', headers,
  body: JSON.stringify({ task: 'Say hello. Call store-progress with status completed.', agentId: LEAD_ID })
})).json();
console.log('Task:', t.id, '| status:', t.status);

Important: Use agentId (not assignedTo) to assign tasks. Wrong param silently creates an unassigned task.

Pool task (auto-claimed by worker)

javascript

const t = await (await fetch('http://localhost:$PORT/api/tasks', {
  method: 'POST', headers,
  body: JSON.stringify({ task: 'Say hello. Call store-progress with status completed.' })
})).json();
console.log('Pool task:', t.id, '| status:', t.status);

Workers auto-claim unassigned tasks at poll time. Leads do not auto-claim pool tasks.

Step 8: Monitor Progress

bash

# Watch lead logs (use your container name)
docker logs -f e2e-lead-$(git branch --show-current | tr '/' '-') 2>&1 | tail -20

# Watch worker logs
docker logs -f e2e-worker-$(git branch --show-current | tr '/' '-') 2>&1 | tail -20

Poll task status:

javascript

const t = await (await fetch('http://localhost:$PORT/api/tasks/<task-id>', { headers })).json();
console.log(t.status);  // pending → in_progress → completed/failed

Step 9: Verify Session Logs

javascript

const logs = await (await fetch('http://localhost:$PORT/api/tasks/<task-id>/session-logs', { headers })).json();
console.log('Log count:', logs.logs.length);
// Should be > 0 for completed tasks

For log isolation verification (multiple sequential tasks from same agent):

javascript

const [l1, l2] = await Promise.all([
  fetch('http://localhost:$PORT/api/tasks/<task1>/session-logs', { headers }).then(r => r.json()),
  fetch('http://localhost:$PORT/api/tasks/<task2>/session-logs', { headers }).then(r => r.json()),
]);
const s1 = [...new Set(l1.logs.map(l => l.sessionId))];
const s2 = [...new Set(l2.logs.map(l => l.sessionId))];
console.log('Unique sessionIds:', s1[0] !== s2[0]);  // Should be true

Step 10: Test the Dashboard UI

Start the dashboard to visually verify tasks, logs, and agent status:

bash

cd new-ui && pnpm run dev &
# Defaults to port from APP_URL in .env (check with: grep APP_URL ../.env)

If the UI port is taken by another worktree, start on an alternate:

bash

cd new-ui && pnpm run dev --port 5276

The UI connects to the API via VITE_API_URL (check new-ui/.env or defaults to http://localhost:$PORT).

Visual verification with agent-browser / qa-use

Use agent-browser or qa-use to automate UI checks:

bash

# Quick visual gut-check with agent-browser
agent-browser --url http://localhost:5175 snapshot

# Or use qa-use to verify specific flows
qa-use explore http://localhost:5175

Things to verify in the UI:

Agents page: Lead and worker both show as registered with correct status
Tasks page: Tasks appear with correct status, assigned agent, and timestamps
Task detail → Logs tab: Session logs render in the conversation viewer (not "No session data available")
Task detail → Outcome tab: Completed tasks show output
Costs: Session costs appear for completed tasks

Step 11: Cleanup

bash

# Stop containers (use your branch-specific names)
docker stop e2e-lead-$(git branch --show-current | tr '/' '-') e2e-worker-$(git branch --show-current | tr '/' '-') 2>/dev/null

# Stop API server
lsof -ti :$PORT | xargs kill 2>/dev/null

# Stop UI dev server (if started)
lsof -ti :5175 | xargs kill 2>/dev/null

Troubleshooting

Docker daemon not running

ERROR: Cannot connect to the Docker daemon

Fix: open -a OrbStack and wait ~5s.

Container name conflict

docker: Error response from daemon: Conflict. The container name "..." is already in use

Another worktree has a container with the same name. Either stop it (docker stop <name>) or use a different name suffix.

Lead not picking up tasks

Verify task was created with agentId (not assignedTo) — wrong param silently creates an unassigned task
Check task status isn't already in_progress (e.g. from a manual poll call that consumed the trigger)
Restart container if stuck: docker restart <container-name>

Worker not picking up pool tasks

Workers auto-claim via poll. Leads do not claim pool tasks.
Check worker has capacity: docker logs <container> 2>&1 | grep "capacity"
If "At capacity" — a previous task is still running. Wait or restart.

Poll returns 404

Poll endpoint is GET /api/poll (not POST)
Requires X-Agent-ID header with a valid agent UUID

Port conflicts (worktrees)

bash

lsof -i :3013  # Check what's using the port

If another worktree is running, set a different PORT in .env and update MCP_BASE_URL in .env.docker* to http://host.docker.internal:<new-port>.

Session logs show 0 entries

Task must have actually run (status completed or failed, not just in_progress)
Check claudeSessionId is set on the task: GET /api/tasks/<id> should show it
If logs were stored under wrong taskId, check the session_logs table directly

Task cancellation doesn't stop Claude

Direct API cancellation (POST /api/tasks/<id>/cancel) updates the DB but doesn't kill the Claude process inside Docker. Use docker restart <container> to force-stop.

Keep tasks trivial

Use simple tasks like "Say hello" for E2E tests. Complex tasks waste time and API credits.

UI shows stale data

The dashboard auto-polls every 5 seconds. If data looks stale, hard-refresh (Cmd+Shift+R) or check VITE_API_URL points to the correct API port.

Maintainer

desplega-ai Core maintainer

Source details

Full Name: desplega-ai/agent-swarm
Branch: main
Path in repo: .claude/skills/swarm-local-e2e
License: MIT License
Topics: claude-code claude ai-agents swarm

Featured Tools

Join Our Newsletter

Effective communication within the agent swarm using internal Slack

335 31

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

Local E2E Testing Guide

When to Use This Skill

Prerequisites

Step 1: Determine Your Port

Step 2: Clean DB + Start API Server

Step 3: Build Docker Image

Step 4: Start Lead Container

Step 5: Start Worker Container

Step 6: Verify Registration

Step 7: Create Tasks

Assigned task (picked up by lead)

Pool task (auto-claimed by worker)

Step 8: Monitor Progress

Step 9: Verify Session Logs

Step 10: Test the Dashboard UI

Visual verification with agent-browser / qa-use

Step 11: Cleanup

Troubleshooting

Docker daemon not running

Container name conflict

Lead not picking up tasks

Worker not picking up pool tasks

Poll returns 404

Port conflicts (worktrees)

Session logs show 0 entries

Task cancellation doesn't stop Claude

Keep tasks trivial

UI shows stale data

Recommended Agent Skills

implement-issue

start-leader

investigate-sentry-issue

user-management

close-issue

swarm-chat