Agent skill
babysit-pr
Use when the user asks to monitor, watch, or babysit a PR until CI passes or requires intervention. Polls CI status, delegates failure log analysis to a subagent, auto-fixes branch-related issues, and stops when the PR is green or blocked.
Install this agent skill to your Project
npx add-skill https://github.com/rstacruz/agentic-toolkit/tree/main/skill/atk-extras/babysit-pr
SKILL.md
babysit-pr
Monitor a PR's CI status in a loop. Delegate log analysis to a subagent to keep the main context lean. Fix branch-related failures and push. Stop when CI passes or a blocker requires user input.
Inputs
Accept any of:
- No argument — auto-detect PR from current branch
- PR number
- PR URL
Setup
Find the directory containing this SKILL.md at runtime — that is SKILL_DIR. The polling script lives at $SKILL_DIR/scripts/poll-pr.sh.
Run it as:
bash "$SKILL_DIR/scripts/poll-pr.sh" [<pr-number-or-url>]
Requires gh (≥ 2.32) and jq.
Main loop
LOOP:
snapshot = poll-pr.sh [<pr>]
status=merged → "PR #<n> merged." — STOP
status=closed → "PR #<n> closed without merging." — STOP
status=passing → "CI green. PR #<n> is ready to merge." — STOP
status=pending →
unchanged_count += 1
interval = min(60 × 2^(floor(unchanged_count / 3)), 300) seconds
wait interval, loop
status=failing →
unchanged_count = 0
CHANGED_FILES=$(gh pr view {{PR_NUM}} --json files --jq '.files[].path')
Spawn ONE @general subagent with {{FAILING_RUN_IDS}} and {{CHANGED_FILES}}.
Use the prompt from ## CI failure analysis.
Read classification:
branch-related →
Check worktree: if unrelated uncommitted changes exist → STOP, ask user
Fix the code
git add, commit, and push
If push rejected or auth/permission error → STOP, report
unchanged_count = 0, loop
flaky/infra →
For each failing run_id:
retry_count[run_id] += 1
if retry_count[run_id] > 3 → STOP, "Flaky retry budget exhausted: <workflow_name>"
gh run rerun <run-id> --failed (for each failing run)
loop
uncertain →
STOP — report what was found and ask user to classify manually
Output cadence
- Emit a progress update only on status changes (pending → failing → pushing, etc.)
- Emit a heartbeat every 5 unchanged pending polls ("still pending, next check in Xs…")
- On stop: output a final summary — PR number, final status, commits pushed, retries used
CI failure analysis
Send this prompt to the subagent, substituting {{PR_NUM}}, {{CHANGED_FILES}}, and {{FAILING_RUN_IDS}}:
Analyse CI failures for PR #{{PR_NUM}}.
Files changed by this PR:
{{CHANGED_FILES}}
For each run ID in {{FAILING_RUN_IDS}}, run:
gh run view <id> --log-failed
gh run view <id> --json jobs,workflowName,headSha,url --jq '{workflowName,headSha,url,jobs:[.jobs[]|{name,conclusion,steps:[.steps[]|select(.conclusion=="failure")|{name,conclusion}]}]}'
Classify the overall failure set as: branch-related, flaky/infra, OR uncertain.
CLASSIFICATION SIGNALS
Strong branch-related signals (if any present → classify branch-related):
- Compile, type, build, or lint errors (regardless of which file is reported — errors surface in downstream files too)
- Stack traces or error messages referencing files from CHANGED_FILES
- New logical assertion failures in code introduced by this PR
Strong flaky/transient signals: transient infra errors (timeouts, connection refused, resource exhaustion, runner startup failures)
Weak/ambiguous signals (supporting evidence only — not sufficient alone):
- UNKNOWN STEP in logs (can be a gh CLI log-association limitation)
- No step logs (can be a gh fetch artifact, not necessarily infra failure)
- Test file not in CHANGED_FILES (most real regressions break untouched tests)
DECISION RULE (conservative — bias toward caution):
Any strong branch-related signal present?
→ branch-related: describe exactly what to fix and in which file(s)
Strong flaky/transient signals only, no strong branch-related signals?
→ flaky/infra: confirm no changed code is implicated
Mixed signals or ambiguous?
→ uncertain: summarise what was found and why it's unclear
Note: gh run view --log-failed output can be lossy (truncated logs, misattributed steps).
If evidence is insufficient for confidence, classify as uncertain.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
refine-implementation
Use after implementation to simplify and review code. Provide: git range (eg, main...HEAD). Runs simplify + peer review loop until change set is clean.
implement-spec
Implements a spec on a ticket-by-ticket basis.
spec-product-requirements
Gives important guidelines to define product requirements sections (functional requirements, technical requirements, constraints, design considerations, diagrams). Companion to $spec-mode.
coding-practices
Contains important guidelines for software engineering, coding, programming. Includes (but not limited to): - CP1: Functional core, imperative shell - CP2: Operational vs unexpected errors - CP3: Result-oriented interface pattern - CP4: Presentational vs container components - CP5: Log context builder pattern Use this when writing, editing, debugging, planning, or otherwise working with: - Any programming work - UI components in React, Vue, or similar - JavaScript, TypeScript, Rust, or any programming language
spec-tech-design
Gives important guidelines to define technical design sections (call graphs, data models, pseudocode, files, CSS classes, testing strategy). Companion to $spec-mode skill.
review-changes
Use when reviewing code changes against a plan. Provide: plan/spec doc; git range or changed files (eg, branch...HEAD). Returns P1/P2/P3 on alignment, quality, bugs, security.
Didn't find tool you were looking for?