Agent skill

sym-debug

Investigate stuck runs and execution failures by tracing Symphony and Codex logs with issue/session identifiers; use when runs stall, retry repeatedly, or fail unexpectedly.

Install this agent skill to your Project

npx add-skill https://github.com/gannonh/kata/tree/main/apps/symphony/skills/sym-debug

SKILL.md

Debug

Goals

Find why a run is stuck, retrying, or failing.
Correlate Linear issue identity to a Codex session quickly.
Read the right logs in the right order to isolate root cause.

Log Sources

Primary runtime log file: <logs-root>/log/symphony.log
- When Symphony runs with --logs-root, it writes rotating JSON logs under this path (see apps/symphony/README.md).
- Includes orchestrator, agent runner, and Codex app-server lifecycle logs.
Rotated runtime logs: <logs-root>/log/symphony.log*
- Check these when the relevant run is older than the active file.
Stdout fallback: structured JSON log stream
- Without --logs-root, logs stream to stdout instead of a file.
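
The file-mode lookup above can be sketched as a small bash helper. This is a minimal sketch, not part of Symphony: the function name `list_symphony_logs` is hypothetical, and it assumes the logs root is passed as the first argument. It lists the active and rotated files, and reports when none exist, which usually points at stdout mode.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every Symphony log file under <logs-root>/log,
# one per line. Runs in a subshell (parenthesized body) so that enabling
# nullglob does not leak into the caller's shell.
list_symphony_logs() (
  shopt -s nullglob
  files=( "$1"/log/symphony.log* )
  if (( ${#files[@]} == 0 )); then
    echo "no log files under $1/log; Symphony may be logging to stdout" >&2
    return 1
  fi
  printf '%s\n' "${files[@]}"
)

# Usage (bash 4+ for mapfile; the path is a placeholder):
#   mapfile -t LOG_PATHS < <(list_symphony_logs "<logs-root>")
```

With nullglob enabled, an unmatched pattern expands to nothing instead of being passed to the search tool as a literal string.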

Correlation Keys

issue_identifier: human ticket key (example: MT-625)
issue_id: Linear UUID (stable internal ID)
session_id: Codex thread-turn pair (<thread_id>-<turn_id>)

These fields are emitted by Symphony runtime lifecycle logs (notably in apps/symphony/src/orchestrator.rs and apps/symphony/src/codex/app_server.rs). Use them as your join keys during debugging.
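
To illustrate using these fields as join keys, here is a minimal bash sketch that pulls one field out of a log line. It assumes fields are rendered as key=value and end at a space or semicolon, which is what the search patterns in this skill imply. The helper name `field` and the sample line in the comments are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of one key=value field from a log
# line, or "-" when the field is absent.
field() {
  local key=$1 line=$2
  # Keep the regex in a variable so the space and semicolon inside the
  # bracket expression need no shell escaping. The leading group stops
  # "issue_id" from matching inside a longer key name.
  local re="(^|[^[:alnum:]_])${key}=([^ ;]+)"
  if [[ $line =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}"
  else
    printf '%s\n' "-"
  fi
}

# Example with a made-up line (not real Symphony output):
#   line='Codex session started issue_identifier=MT-625 issue_id=abc-123 session_id=t1-u2'
#   field issue_identifier "$line"   # prints MT-625
#   field session_id "$line"         # prints t1-u2
```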

Quick Triage (Stuck Run)

Confirm scheduler/worker symptoms for the ticket.
Find recent lines for the ticket (issue_identifier first).
Extract session_id from matching lines.
Trace that session_id across start, stream, completion/failure, and stall handling logs.
Decide the class of failure: timeout/stall, app-server startup failure, turn failure, or orchestrator retry loop.
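
The ticket-to-session steps above can be sketched as one filter. This is a minimal sketch under assumptions: log lines arrive on stdin, `issue_identifier` and `session_id` appear on the same line as space-separated key=value fields, and the function name `sessions_for_ticket` is hypothetical. It compares whole fields, so MT-625 does not also pick up MT-6250.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read log lines on stdin and print each distinct
# session_id seen on lines that carry the given ticket key.
sessions_for_ticket() {
  local ticket=$1
  awk -v want="issue_identifier=$ticket" '
    {
      hit = 0; sid = ""
      for (i = 1; i <= NF; i++) {
        f = $i
        sub(/;$/, "", f)                 # tolerate a trailing semicolon
        if (f == want) hit = 1           # whole-field match on the ticket
        if (f ~ /^session_id=/) sid = substr(f, 12)
      }
      if (hit && sid != "" && !seen[sid]++) print sid
    }'
}

# Usage: cat "${LOG_PATHS[@]}" | sessions_for_ticket MT-625
```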

Commands

bash

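# File-log mode (`--logs-root` enabled): expand to active + rotated files.
# The default glob is relative: run from <logs-root>, or set LOG_GLOB to <logs-root>/log/symphony.log*.
LOG_PATHS=( ${LOG_GLOB:-log/symphony.log*} )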

# 1) Narrow by ticket key (fastest entry point)
rg -n "issue_identifier=MT-625" "${LOG_PATHS[@]}"

# 2) If needed, narrow by Linear UUID
rg -n "issue_id=<linear-uuid>" "${LOG_PATHS[@]}"

# 3) Pull session IDs seen for that ticket (-I drops filename prefixes so sort -u dedupes across files)
rg -I "issue_identifier=MT-625" "${LOG_PATHS[@]}" | rg -o "session_id=[^ ;]+" | sort -u

# 4) Trace one session end-to-end
rg -n "session_id=<thread>-<turn>" "${LOG_PATHS[@]}"

# 5) Focus on stuck/retry signals
rg -n "Issue stalled|scheduling retry|turn_timeout|turn_failed|Codex session failed|Codex session ended with error" "${LOG_PATHS[@]}"

# Stdout mode (startup banner shows `Logs: stdout`): use your runtime stream.
journalctl -u symphony --since "30 minutes ago" --no-pager \
  | rg -n "issue_identifier=MT-625|issue_id=<linear-uuid>|session_id=<thread>-<turn>|Issue stalled|scheduling retry|turn_timeout|turn_failed|Codex session failed|Codex session ended with error"

# Containerized deploys can use docker logs instead of journalctl.
docker logs <symphony-container> --since 30m 2>&1 \
  | rg -n "issue_identifier=MT-625|issue_id=<linear-uuid>|session_id=<thread>-<turn>|Issue stalled|scheduling retry|turn_timeout|turn_failed|Codex session failed|Codex session ended with error"

Investigation Flow

Locate the ticket slice:
- Search by issue_identifier=<KEY>.
- If noise is high, add issue_id=<UUID>.
Establish timeline:
- Identify the first Codex session started ... session_id=... line.
- Follow it to the matching Codex session completed, ended with error, or worker exit line.
Classify the problem:
- Stall loop: Issue stalled ... restarting with backoff.
- App-server startup: Codex session failed ....
- Turn execution failure: turn_failed, turn_cancelled, turn_timeout, or ended with error.
- Worker crash: Agent task exited ... reason=....
Validate scope:
- Check whether failures are isolated to one issue/session or repeating across multiple tickets.
Capture evidence:
- Save key log lines with timestamps, issue_identifier, issue_id, and session_id.
- Record probable root cause and the exact failing stage.
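
The classify and validate-scope steps above can be sketched together. This is a minimal sketch, assuming log lines on stdin and the message fragments listed in this flow. The function names and the class labels (`stall`, `app-server-startup`, `turn-failure`, `worker-crash`) are hypothetical labels chosen for illustration, not values Symphony emits.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map one log line to a failure class from the flow.
classify_line() {
  case $1 in
    *"Issue stalled"*)        echo "stall" ;;
    *"Codex session failed"*) echo "app-server-startup" ;;
    *turn_failed*|*turn_cancelled*|*turn_timeout*|*"ended with error"*)
                              echo "turn-failure" ;;
    *"Agent task exited"*)    echo "worker-crash" ;;
    *)                        echo "other" ;;
  esac
}

# Hypothetical helper: tally failure classes per ticket from stdin.
# Prints "<ticket> <class> <count>" so repeats across tickets stand out.
tally_failures() {
  local line class ticket
  local re='issue_identifier=([^ ;]+)'
  while IFS= read -r line; do
    class=$(classify_line "$line")
    if [[ $class == other ]]; then continue; fi
    ticket="unknown"
    if [[ $line =~ $re ]]; then ticket=${BASH_REMATCH[1]}; fi
    printf '%s\t%s\n' "$ticket" "$class"
  done | sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Usage: cat "${LOG_PATHS[@]}" | tally_failures
```

A single ticket with a high count suggests an isolated problem; the same class repeating across many tickets suggests a shared cause.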

Reading Codex Session Logs

In Symphony, Codex session diagnostics are emitted into <logs-root>/log/symphony.log (or stdout when --logs-root is not set) and keyed by session_id. Read them as a lifecycle:

Codex session started ... session_id=...
Session stream/lifecycle events for the same session_id
Terminal event:
- Codex session completed ..., or
- Codex session ended with error ..., or
- Issue stalled ... restarting with backoff

For one specific session investigation, keep the trace narrow:

Capture one session_id for the ticket.
Build a timestamped slice for only that session:
- rg -n "session_id=<thread>-<turn>" "${LOG_PATHS[@]}"
Mark the exact failing stage:
- Startup failure before stream events (Codex session failed ...).
- Turn/runtime failure after stream events (turn_* / ended with error).
- Stall recovery (Issue stalled ... restarting with backoff).
Pair findings with issue_identifier and issue_id from nearby lines to confirm you are not mixing concurrent runs or retries.
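
The lifecycle reading described in this section can be sketched as a filter that reports the last terminal event for one session. This is a minimal sketch under assumptions: log lines arrive on stdin, every relevant line (including the stall line) carries `session_id=<value>` followed by a space, semicolon, or end of line, and the function name `session_outcome` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the latest terminal state seen for a session.
# Prints "not found" if the session never appears, and "in progress" if it
# appears without any terminal event.
session_outcome() {
  local sid=$1 line outcome="not found"
  while IFS= read -r line; do
    # Match the whole session ID so that a-1 does not also match a-10.
    [[ "$line " == *"session_id=$sid "* || $line == *"session_id=$sid;"* ]] || continue
    if [[ $outcome == "not found" ]]; then outcome="in progress"; fi
    case $line in
      *"Codex session completed"*)        outcome="completed" ;;
      *"Codex session ended with error"*) outcome="ended with error" ;;
      *"Issue stalled"*)                  outcome="stalled" ;;
    esac
  done
  printf '%s\n' "$outcome"
}

# Usage: cat "${LOG_PATHS[@]}" | session_outcome "<thread>-<turn>"
```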

Notes

Prefer rg over grep for speed on large logs.
Check rotated logs (<logs-root>/log/symphony.log*) before concluding data is missing.
If required context fields are missing in new log statements, align with existing structured lifecycle logging in apps/symphony/src/orchestrator.rs and apps/symphony/src/codex/app_server.rs.

Maintainer

gannonh Core maintainer

Source details

Full Name: gannonh/kata
Branch: main
Path in repo: apps/symphony/skills/sym-debug
License: Apache License 2.0
Topics: claude-code agents codex agentic-workflow agentic-ai
