Agent skill
verify-completion
Enforce verification evidence before any completion claim. Tests, builds, and lint must be run with output shown — no claims without proof. Triggers on "verify", "verification", "done check", "completion check".
Install this agent skill to your Project
npx add-skill https://github.com/devseunggwan/my-skills/tree/main/skills/verify-completion
SKILL.md
Verify Completion
Overview
Claiming work is complete without verification is dishonesty, not efficiency.
Core principle: Evidence before claims, always.
Replaces: superpowers:verification-before-completion
Delegates to: OMC ultraqa (test → verify → fix → repeat cycle)
The Iron Law
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
If you haven't run the verification command in this message, you cannot claim it passes.
Violating the letter of this rule is violating the spirit of this rule.
When to Use
ALWAYS before:
- ANY variation of "done", "fixed", "complete", "passes"
- ANY expression of satisfaction ("Great!", "Perfect!")
- Committing, pushing, or creating PRs
- Moving to the next task
- Trusting agent success reports
Workflow position:
[implementation] → [code-review] → [verify-completion] → [create-hub-pr]
The Gate Function
BEFORE claiming any status or expressing satisfaction:
1. IDENTIFY: What command proves this claim?
2. RUN: Execute the FULL command (fresh, complete)
3. READ: Full output, check exit code, count failures
4. VERIFY: Does output confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim WITH evidence
5. ONLY THEN: Make the claim
Skip any step = lying, not verifying
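The gate can be sketched as a small shell wrapper (the function name and output format are illustrative assumptions, not part of the skill): run the proving command fresh, read its exit code, and only then emit a claim.

```shell
# Sketch of the Gate Function: RUN the full command, READ the exit
# code, and only make the claim when the evidence confirms it.
gate() {
  claim="$1"; shift
  if "$@"; then
    # Exit code 0: state the claim WITH the command that proved it
    echo "PASS: $claim (command: $*)"
  else
    rc=$?
    # Non-zero exit: state the actual status, never the claim
    echo "FAIL: $claim (command: $*, exit $rc)"
    return "$rc"
  fi
}

# Example: gate "unit tests pass" pytest -v
```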
Process
Step 1: Identify Verification Targets
| Target | Typical Command | Required? |
|---|---|---|
| Unit tests | `pytest -v`, `npm test`, `go test ./...` | Always |
| Lint | `ruff check .`, `eslint .`, `golangci-lint run` | Always |
| Build | `npm run build`, `cargo build`, `go build ./...` | If build system exists |
| Type check | `mypy .`, `tsc --noEmit` | If type system exists |
| Functional test | DAG trigger, API call, CLI execution | If external systems changed |
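One way to apply the table is to probe the repo for its build-system markers before choosing commands. A hedged sketch (file names and command pairings are the usual conventions, not guarantees about any given repo):

```shell
# Infer which verification targets apply by checking for common
# config files in the current directory.
detect_targets() {
  [ -f pyproject.toml ] && echo "python: pytest -v; ruff check ."
  [ -f package.json ]   && echo "node: npm test; npm run build"
  [ -f go.mod ]         && echo "go: go test ./...; go build ./..."
  [ -f Cargo.toml ]     && echo "rust: cargo test; cargo build"
  [ -f tsconfig.json ]  && echo "typescript: tsc --noEmit"
  return 0
}
```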
Step 2: Run Verification Cycle
Delegate to OMC ultraqa when available:
ultraqa cycle: test → verify → fix (on failure) → repeat (until pass)
If ultraqa is unavailable, run manually:
```shell
# 1. Tests
pytest -v <test-path>

# 2. Lint
ruff check . && ruff format --check .

# 3. Build (if applicable)
npm run build

# 4. Functional test (if applicable)
hubctl exec <env> airflow dags trigger <dag_id>
```
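The manual cycle can be wrapped in a fail-fast runner so the first failing command's output is the last thing on screen (the `run_all` name is an assumption for illustration):

```shell
# Run each verification command in order; stop at the first failure.
run_all() {
  for cmd in "$@"; do
    echo "RUN: $cmd"
    sh -c "$cmd" || { echo "FAIL at: $cmd"; return 1; }
  done
  echo "ALL PASS"
}

# Example: run_all "pytest -v" "ruff check ." "npm run build"
```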
Step 3: Report Evidence
Show actual output for each verification target:
✅ Verification results:
- Tests: 34/34 pass (0 failures) — `pytest -v` output confirmed
- Lint: 0 errors, 0 warnings — `ruff check .` output confirmed
- Build: exit code 0 — `npm run build` output confirmed
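Evidence reporting can be mechanized by capturing each command's full output and exit code (names and message format here are illustrative assumptions):

```shell
# Run a command, keep its full output in a log file, and report the
# exit code plus log location so the claim is backed by real output.
report() {
  log=$(mktemp)
  if "$@" >"$log" 2>&1; then
    echo "✅ $* — exit 0 (full output in $log)"
  else
    rc=$?
    echo "❌ $* — exit $rc; last lines of output:"
    tail -n 5 "$log"
    return "$rc"
  fi
}
```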
Step 4: Handle Failures
If verification fails:
- Report failure with exact output (include error messages)
- Fix immediately (do not ask user for permission)
- Re-run from Step 2
- If 2+ failures on the same issue → use the debug skill for root cause analysis
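The failure-handling loop above can be sketched as follows (the threshold of 2 comes from the last bullet; the fix itself is left as a placeholder comment):

```shell
# Verify, and on failure fix and re-run; after 2 failures on the
# same issue, stop retrying and escalate to root cause analysis.
verify_with_retry() {
  fails=0
  while true; do
    if "$@"; then
      echo "verified"
      return 0
    fi
    fails=$((fails + 1))
    if [ "$fails" -ge 2 ]; then
      echo "2+ failures on the same issue — escalate to root cause analysis"
      return 1
    fi
    echo "failure $fails — fix, then re-run"
    # (in practice: apply a fix here before re-running)
  done
}
```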
Common Failures
| Claim | Requires | Not Sufficient |
|---|---|---|
| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build command: exit 0 | Linter passing, "logs look good" |
| Bug fixed | Test original symptom: passes | "Code changed, assumed fixed" |
| Requirements met | Line-by-line checklist verified | "Tests pass" alone |
| API works | Response body content verified | HTTP 200 status code alone |
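For the "API works" row, a check that satisfies the table must assert on body content, not just the status code. A hedged sketch, with status and body passed in so the check is separable from the transport (the `curl` usage in the comment is illustrative; its flags are standard):

```shell
# An HTTP 200 alone is Not Sufficient: require 200 AND the expected
# content in the response body.
api_ok() {
  status="$1"; body="$2"; expected="$3"
  if [ "$status" != "200" ]; then
    echo "FAIL: status $status"
    return 1
  fi
  case "$body" in
    *"$expected"*) echo "PASS: 200 and body contains $expected" ;;
    *) echo "FAIL: 200 but body lacks $expected"; return 1 ;;
  esac
}

# Example with curl:
#   status=$(curl -s -o /tmp/body.json -w '%{http_code}' "$URL")
#   api_ok "$status" "$(cat /tmp/body.json)" '"status":"ok"'
```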
Rationalization Prevention
| Excuse | Reality |
|---|---|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "Linter passed so build will too" | Linter ≠ compiler |
| "Agent said success" | Verify independently |
| "I'm tired" | Exhaustion ≠ excuse |
| "Partial check is enough" | Partial proves nothing |
| "Different wording so rule doesn't apply" | Spirit over letter |
| "Simple change, no verification needed" | Simple changes break too. Run it. |
| "HTTP 200 means it works" | Check response body content, not just status code |
Red Flags — STOP
If you catch yourself about to:
- Use "should", "probably", "seems to"
- Express satisfaction before verification ("Great!", "Done!")
- Commit / push / create PR without verification
- Trust agent success reports at face value
- Rely on partial verification
- Think "just this once"
- Use ANY wording implying success without having run verification
ALL of these mean: STOP. Run the Gate Function.
Integration
Previous step: code-review (code quality self-review)
Next step: create-hub-pr (PR creation)
OMC delegation:
- ultraqa: test → verify → fix → repeat cycle
- debugger agent: root cause analysis on persistent failures