Agent skill

babysit-pr

Use when the user asks to monitor, watch, or babysit a PR until CI passes or requires intervention. Polls CI status, delegates failure log analysis to a subagent, auto-fixes branch-related issues, and stops when the PR is green or blocked.

Install this agent skill to your Project

npx add-skill https://github.com/rstacruz/agentic-toolkit/tree/main/skill/atk-extras/babysit-pr

SKILL.md

babysit-pr

Monitor a PR's CI status in a loop. Delegate log analysis to a subagent to keep the main context lean. Fix branch-related failures and push. Stop when CI passes or a blocker requires user input.

Inputs

Accept any of:

- No argument: auto-detect the PR from the current branch
- PR number
- PR URL
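
The three input forms above could be normalised to a single PR number along these lines. This is only a sketch: `resolve_pr_number` is a hypothetical helper, not something taken from `poll-pr.sh`, and the URL pattern assumes the usual `.../OWNER/REPO/pull/<n>` shape.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from poll-pr.sh): turn the skill's input into a PR number.
resolve_pr_number() {
  local arg=${1:-}
  local number_re='^[0-9]+$'
  local url_re='^https?://[^/]+/[^/]+/[^/]+/pull/([0-9]+)'

  if [[ -z "$arg" ]]; then
    # No argument: let gh resolve the PR associated with the current branch.
    gh pr view --json number --jq '.number'
  elif [[ "$arg" =~ $number_re ]]; then
    # Bare PR number.
    echo "$arg"
  elif [[ "$arg" =~ $url_re ]]; then
    # PR URL, possibly with a trailing suffix such as /files.
    echo "${BASH_REMATCH[1]}"
  else
    echo "unrecognised PR reference: $arg" >&2
    return 1
  fi
}
```

`gh pr view` itself accepts a number, a URL, or a branch, so a step like this mostly matters for producing consistent "PR #<n>" messages.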

Setup

Find the directory containing this SKILL.md at runtime — that is SKILL_DIR. The polling script lives at $SKILL_DIR/scripts/poll-pr.sh.

Run it as:

bash "$SKILL_DIR/scripts/poll-pr.sh" [<pr-number-or-url>]

Requires gh (≥ 2.32) and jq.
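
A preflight check for these requirements might look like the sketch below. Both `version_ge` and `check_prereqs` are hypothetical helpers, and the parsing assumes the first line of `gh --version` has the form `gh version X.Y.Z (date)`.

```bash
#!/usr/bin/env bash
# Returns 0 when dotted version $1 is >= dotted version $2 (numeric compare
# of up to three components; missing components count as 0).
version_ge() {
  local IFS=.
  local -a have=($1) want=($2)
  local i
  for i in 0 1 2; do
    local h=${have[i]:-0} w=${want[i]:-0}
    (( 10#$h > 10#$w )) && return 0
    (( 10#$h < 10#$w )) && return 1
  done
  return 0
}

# Hypothetical preflight: fail early if jq or a recent enough gh is missing.
check_prereqs() {
  command -v jq >/dev/null 2>&1 || { echo "jq not found" >&2; return 1; }
  command -v gh >/dev/null 2>&1 || { echo "gh not found" >&2; return 1; }
  local gh_version
  # Assumed format of the first line: gh version 2.40.1 (2023-12-13)
  gh_version=$(gh --version | awk 'NR==1 {print $3}')
  version_ge "$gh_version" 2.32 || {
    echo "gh $gh_version is older than 2.32" >&2
    return 1
  }
}
```

The comparison is numeric per component, so `2.4.0` is correctly treated as older than `2.32`.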

Main loop

LOOP:
  snapshot = poll-pr.sh [<pr>]

  status=merged  → "PR #<n> merged." — STOP
  status=closed  → "PR #<n> closed without merging." — STOP
  status=passing → "CI green. PR #<n> is ready to merge." — STOP

  status=pending →
    unchanged_count += 1
    interval = min(60 × 2^(floor(unchanged_count / 3)), 300) seconds
    wait interval, loop

  status=failing →
    unchanged_count = 0
    CHANGED_FILES=$(gh pr view {{PR_NUM}} --json files --jq '.files[].path')
    Spawn ONE @general subagent with {{FAILING_RUN_IDS}} and {{CHANGED_FILES}}.
    Use the prompt from ## CI failure analysis.

    Read classification:
      branch-related →
        Check worktree: if unrelated uncommitted changes exist → STOP, ask user
        Fix the code
        git add, commit, and push
        If push rejected or auth/permission error → STOP, report
        unchanged_count = 0, loop

      flaky/infra →
        For each failing run_id:
          retry_count[run_id] += 1
          if retry_count[run_id] > 3 → STOP, "Flaky retry budget exhausted: <workflow_name>"
        gh run rerun <run-id> --failed  (for each failing run)
        loop

      uncertain →
        STOP — report what was found and ask user to classify manually
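
The two pieces of arithmetic in the loop above, the pending backoff and the per-run retry budget, can be sketched in bash as follows. The function names `poll_interval` and `consume_retry` are illustrative assumptions, and the associative array requires bash 4 or newer.

```bash
#!/usr/bin/env bash
# Seconds to wait after the Nth consecutive pending poll:
# min(60 * 2^floor(N / 3), 300).
poll_interval() {
  local unchanged_count=$1
  local exponent=$(( unchanged_count / 3 ))
  # 60 * 2^3 = 480 already exceeds the cap; clamping here also avoids
  # shifting by a huge exponent after a very long wait.
  if (( exponent >= 3 )); then
    echo 300
  else
    echo $(( 60 * (1 << exponent) ))
  fi
}

# Per-run retry counters, keyed by run ID (associative arrays need bash 4+).
declare -A retry_count=()

# Increments the counter for a run. Succeeds while the run is within the
# budget of 3 retries and fails on the 4th attempt.
consume_retry() {
  local run_id=$1
  local used=$(( ${retry_count[$run_id]:-0} + 1 ))
  retry_count[$run_id]=$used
  (( used <= 3 ))
}
```

With this formula, pending polls 1 and 2 wait 60 seconds, polls 3 to 5 wait 120 seconds, polls 6 to 8 wait 240 seconds, and every poll from the 9th onward waits the capped 300 seconds.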

Output cadence

- Emit a progress update only on status changes (pending → failing → pushing, etc.)
- Emit a heartbeat every 5 unchanged pending polls ("still pending, next check in Xs…")
- On stop, output a final summary: PR number, final status, commits pushed, retries used
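
One possible reading of the heartbeat rule is "emit when the unchanged-poll counter is a positive multiple of 5". The sketch below assumes that reading; `heartbeat_due` and `heartbeat_message` are hypothetical names.

```bash
#!/usr/bin/env bash
# Succeeds on every 5th consecutive pending poll (5, 10, 15, ...).
heartbeat_due() {
  local unchanged_count=$1
  (( unchanged_count > 0 && unchanged_count % 5 == 0 ))
}

# Formats the heartbeat line; the argument is the next wait in seconds.
heartbeat_message() {
  local next_interval=$1
  printf 'still pending, next check in %ss…\n' "$next_interval"
}
```

A loop would call `heartbeat_due "$unchanged_count" && heartbeat_message "$interval"` after each pending poll and stay silent otherwise.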

CI failure analysis

Send this prompt to the subagent, substituting {{PR_NUM}}, {{CHANGED_FILES}}, and {{FAILING_RUN_IDS}}:

Analyse CI failures for PR #{{PR_NUM}}.

Files changed by this PR:
{{CHANGED_FILES}}

For each run ID in {{FAILING_RUN_IDS}}, run:
  gh run view <id> --log-failed
  gh run view <id> --json jobs,workflowName,headSha,url --jq '{workflowName,headSha,url,jobs:[.jobs[]|{name,conclusion,steps:[.steps[]|select(.conclusion=="failure")|{name,conclusion}]}]}'

Classify the overall failure set as: branch-related, flaky/infra, OR uncertain.

CLASSIFICATION SIGNALS

Strong branch-related signals (if any present → classify branch-related):
- Compile, type, build, or lint errors (regardless of which file is reported — errors surface in downstream files too)
- Stack traces or error messages referencing files from CHANGED_FILES
- New logical assertion failures in code introduced by this PR

Strong flaky/infra signals:
- Transient infrastructure errors (timeouts, connection refused, resource exhaustion, runner startup failures)

Weak/ambiguous signals (supporting evidence only — not sufficient alone):
- UNKNOWN STEP in logs (can be a gh CLI log-association limitation)
- No step logs (can be a gh fetch artifact, not necessarily infra failure)
- Test file not in CHANGED_FILES (most real regressions break untouched tests)

DECISION RULE (conservative — bias toward caution):
  Any strong branch-related signal present?
    → branch-related: describe exactly what to fix and in which file(s)
  Strong flaky/infra signals only, no strong branch-related signals?
    → flaky/infra: confirm no changed code is implicated
  Mixed signals or ambiguous?
    → uncertain: summarise what was found and why it's unclear

Note: gh run view --log-failed output can be lossy (truncated logs, misattributed steps).
If evidence is insufficient for confidence, classify as uncertain.
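
Outside the prompt itself, the order of the decision rule can be expressed as a small function. This is a sketch under assumptions: `classify` is a hypothetical helper, its inputs are counts of strong signals that the analyst has already identified, and it reads the rule top to bottom, so a strong branch-related signal wins even when flaky signals are also present. It cannot encode the final "insufficient evidence" judgement, which stays with the subagent.

```bash
#!/usr/bin/env bash
# classify <strong_branch_signals> <strong_flaky_signals>
# Both arguments are hypothetical counts supplied by whoever read the logs.
classify() {
  local branch=$1 flaky=$2
  if (( branch > 0 )); then
    # Any strong branch-related signal decides the outcome.
    echo "branch-related"
  elif (( flaky > 0 )); then
    # Strong flaky/infra signals only.
    echo "flaky/infra"
  else
    # Only weak or no signals: defer to the user.
    echo "uncertain"
  fi
}
```

For example, `classify 0 2` prints `flaky/infra`, while `classify 0 0` prints `uncertain`.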

Maintainer

rstacruz (core maintainer)

Source details

Full Name: rstacruz/agentic-toolkit
Branch: main
Path in repo: skill/atk-extras/babysit-pr
License: MIT License
