evidence-selfloop

Evidence self-loop for surveys: read evidence bindings + evidence packs, then write an actionable upstream TODO plan (which stage/skill to fix) before writing more prose. Writes `output/EVIDENCE_SELFLOOP_TODO.md`.

**Trigger**: evidence self-loop, evidence loop, evidence gaps, binding gaps, blocking_missing, 证据自循环, 证据缺口回路.
**Use when**: C4 outputs exist (`outline/evidence_bindings.jsonl`, `outline/evidence_drafts.jsonl`) but writing looks hollow or C5 is BLOCKED due to thin evidence.
**Skip if**: you are still pre-C3 (no notes/evidence bank yet), or you want to draft anyway and accept a lower evidence bar.
**Network**: none.
**Guardrail**: analysis-only; do not edit evidence/writing artifacts; do not invent facts/citations; only write the TODO report.

Install this agent skill to your Project

npx add-skill https://github.com/WILLOSCAR/research-units-pipeline-skills/tree/main/.codex/skills/evidence-selfloop

SKILL.md

Evidence Self-loop (C3/C4 fix → rebind → redraft)

Purpose: make the evidence-first pipeline converge without writing filler prose.

This skill reads the intermediate evidence artifacts (briefs/bindings/packs) and produces an actionable TODO list that answers:

- Which subsections are under-supported?
- Is the problem mapping/coverage (C2), evidence extraction (C3), or binding/planning (C4)?
- Which skill(s) should be rerun, and in what order, to unblock high-quality writing?

Inputs

- outline/subsection_briefs.jsonl
- outline/evidence_bindings.jsonl (expects binding_gaps / binding_rationale if available)
- outline/evidence_drafts.jsonl (expects blocking_missing, comparisons, eval protocol, limitations)
- Optional (improves routing):
  - outline/evidence_binding_report.md
  - outline/anchor_sheet.jsonl
  - papers/paper_notes.jsonl
  - papers/fulltext_index.jsonl
  - queries.md

Outputs

output/EVIDENCE_SELFLOOP_TODO.md (report-class; always written)

Self-loop contract (what “fixing evidence” means)

Prefer fixing upstream evidence, not writing around gaps.
If an evidence pack has blocking_missing, treat it as a STOP signal: strengthen notes/fulltext/mapping, then regenerate packs.
If bindings show binding_gaps, treat it as a ROUTING signal: either enrich the evidence bank for the mapped papers, expand mapping coverage, or adjust required_evidence_fields if unrealistic.

Recommended rerun chain (minimal):

If C3 evidence is thin: pdf-text-extractor → paper-notes → evidence-binder → evidence-draft → anchor-sheet → writer-context-pack
If C2 coverage is weak: section-mapper → outline-refiner → (then rerun C3/C4 evidence skills)
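
The two minimal chains above can be kept as plain data so that a router can look them up. This is a minimal sketch, not the skill's implementation: the keys `thin_c3_evidence` and `weak_c2_coverage` and the helper `rerun_chain` are hypothetical names, and appending the C3 chain after a C2 fix is one reading of "(then rerun C3/C4 evidence skills)". The skill names themselves are copied from the list above.

```python
from __future__ import annotations

# Hypothetical lookup table; only the skill names come from the document.
RERUN_CHAINS = {
    "thin_c3_evidence": [
        "pdf-text-extractor", "paper-notes", "evidence-binder",
        "evidence-draft", "anchor-sheet", "writer-context-pack",
    ],
    "weak_c2_coverage": ["section-mapper", "outline-refiner"],
}


def rerun_chain(root_cause: str) -> list[str]:
    """Return the ordered skills to rerun for a diagnosed root cause."""
    chain = list(RERUN_CHAINS[root_cause])
    if root_cause == "weak_c2_coverage":
        # Assumption: after a C2 fix, the downstream C3/C4 evidence skills
        # are rerun in the same order as the thin-evidence chain.
        chain += RERUN_CHAINS["thin_c3_evidence"]
    return chain
```

Calling `rerun_chain("weak_c2_coverage")` yields the two C2 skills followed by the six evidence skills, which keeps the "earliest artifact first" ordering explicit.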

Workflow (analysis-only)

1. Read queries.md (if present)
   - Use it only as a soft config hint (evidence_mode / draft_profile); do not override the artifact contract.
2. Read outline/subsection_briefs.jsonl
   - For each sub_id, capture axes + required_evidence_fields (what evidence types this subsection expects).
3. Read outline/evidence_bindings.jsonl
   - For each sub_id, surface binding_rationale and binding_gaps (what the binder could/could not cover from the evidence bank).
4. (Optional) Read outline/evidence_binding_report.md
   - Use it as a human-readable summary; treat it as a view of outline/evidence_bindings.jsonl, not a separate truth source.
5. Read outline/evidence_drafts.jsonl
   - Surface blocking_missing (STOP signals), and check for missing comparisons / eval protocol / limitations that would force hollow writing.
6. (Optional) Read outline/anchor_sheet.jsonl
   - Check whether each subsection has at least a few citation-backed anchors (numbers / evaluation / limitations).
7. (Optional) Read papers/paper_notes.jsonl and papers/fulltext_index.jsonl
   - Use these to route fixes: if evidence is abstract-only and missing eval tokens, prefer enriching notes/fulltext before drafting prose.
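
The two mandatory reading steps above (bindings and drafts) can be sketched as follows. This is an illustration under stated assumptions, not the code in `scripts/run.py`: it assumes each JSONL record is a JSON object with a top-level `sub_id` and list-valued `binding_gaps` / `blocking_missing` fields, and the helper names `read_jsonl` and `collect_signals` are hypothetical. It only reads files, in keeping with the analysis-only guardrail.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return one JSON object per non-empty line; [] if the file is absent."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def collect_signals(workspace):
    """Map each sub_id to its blocking_missing and binding_gaps entries."""
    outline = Path(workspace) / "outline"
    signals = {}
    # Assumed record layout: {"sub_id": ..., "<field>": [...]}.
    sources = [
        ("evidence_bindings.jsonl", "binding_gaps"),
        ("evidence_drafts.jsonl", "blocking_missing"),
    ]
    for filename, field in sources:
        for record in read_jsonl(outline / filename):
            entry = signals.setdefault(
                record["sub_id"], {"blocking_missing": [], "binding_gaps": []}
            )
            entry[field] = list(record.get(field) or [])
    return signals
```

Keeping both signals in one dictionary keyed by `sub_id` makes it easy to tell STOP signals (`blocking_missing`) apart from ROUTING signals (`binding_gaps`) for the same subsection.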

What the report contains

Summary counts: subsections with blocking_missing, with binding_gaps, and common failure reasons.
Per-subsection TODO: the smallest upstream fix path (skills + artifacts) to make the subsection writeable.
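
As an illustration of the two report parts described above, the sketch below renders summary counts and a per-subsection checklist from already-collected signals. The report layout, the function name `render_report`, and the shape of `signals` (a dict from `sub_id` to lists of `blocking_missing` and `binding_gaps` strings) are all assumptions; the real `output/EVIDENCE_SELFLOOP_TODO.md` may be structured differently, and this sketch omits the fix chains.

```python
from collections import Counter


def render_report(status, signals):
    """Render a plain-text TODO report from per-subsection signals."""
    blocked = sorted(s for s, v in signals.items() if v.get("blocking_missing"))
    gapped = sorted(s for s, v in signals.items() if v.get("binding_gaps"))
    # Count how often each blocking reason occurs across subsections.
    reasons = Counter(
        reason for v in signals.values() for reason in v.get("blocking_missing", [])
    )
    lines = [
        f"Status: {status}",
        "",
        "Summary",
        f"- subsections with blocking_missing: {len(blocked)}",
        f"- subsections with binding_gaps: {len(gapped)}",
    ]
    for reason, count in reasons.most_common():
        lines.append(f"- failure reason ({count}x): {reason}")
    lines += ["", "Per-subsection TODO"]
    for sub_id in sorted(signals):
        entry = signals[sub_id]
        for reason in entry.get("blocking_missing", []):
            lines.append(f"- [ ] {sub_id}: blocking_missing: {reason}")
        for gap in entry.get("binding_gaps", []):
            lines.append(f"- [ ] {sub_id}: binding_gaps: {gap}")
    return "\n".join(lines) + "\n"
```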

Status semantics (unblock rules)

This skill is the prewrite router for evidence quality. Treat the `Status:` line in its report as the unblock contract:

PASS: no blocking_missing and no binding_gaps -> proceed to C5 writing (but still scan non-blocking writability smells: low comparisons/eval/anchors often predict hollow prose).
OK: no blocking_missing, but some binding_gaps -> you may draft, but expect weaker specificity; prefer fixing gaps first.
FAIL: missing inputs OR any blocking_missing -> do not write filler prose; fix upstream and rerun C3/C4.
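
The three rules above reduce to a small decision function. A minimal sketch, assuming the caller has already counted subsections with `blocking_missing` and `binding_gaps` and checked that the required inputs exist; the function and parameter names are hypothetical.

```python
def evidence_status(inputs_present, blocking_missing_count, binding_gaps_count):
    """Return PASS, OK, or FAIL following the unblock rules above."""
    if not inputs_present or blocking_missing_count > 0:
        # Missing inputs or any STOP signal: fix upstream, rerun C3/C4.
        return "FAIL"
    if binding_gaps_count > 0:
        # Drafting is allowed, but expect weaker specificity.
        return "OK"
    return "PASS"
```

FAIL is checked first, so a workspace with both `blocking_missing` and `binding_gaps` is reported as FAIL rather than OK.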

Routing matrix (symptom -> root cause -> upstream fix)

Use this as a semantic routing table (not a script checklist). The goal is to fix the earliest broken intermediate artifact.

| Symptom (where you see it) | Likely root cause | Inspect first | Smallest upstream fix chain |
|---|---|---|---|
| `evidence_drafts.blocking_missing: no usable citation keys` | mapped papers lack `bibkey` / bibkeys not in `citations/ref.bib` | `papers/paper_notes.jsonl` (bibkey fields), `citations/ref.bib` | C3 `paper-notes` (ensure bibkeys) -> C4 `citation-verifier` -> rerun `evidence-binder` -> rerun `evidence-draft` |
| `blocking_missing: title-only evidence` | retrieval/metadata lacks abstracts (or aggressive filtering) | `papers/papers_raw.jsonl` abstracts, `papers/paper_notes.jsonl` evidence_level | C1 `literature-engineer` (enrich metadata) OR C3 `pdf-text-extractor` (fulltext) -> rerun `paper-notes` |
| `blocking_missing: no evidence snippets extractable` | notes are too thin / evidence bank empty for mapped papers | `papers/evidence_bank.jsonl` (counts), `papers/paper_notes.jsonl` | C3 `paper-notes` (richer extraction; prefer fulltext when possible) -> rerun C4 packs |
| `blocking_missing: no concrete evaluation tokens` | notes/bank did not extract benchmarks/metrics/budgets | `papers/paper_notes.jsonl` (metrics/benchmarks fields), `outline/anchor_sheet.jsonl` | C3 `paper-notes` (extract eval anchors) -> rerun `anchor-sheet` + `evidence-draft` |
| `evidence pack comparisons` are sparse (signals: comparisons low) | clusters are not contrastable OR mapping coverage too weak | `outline/subsection_briefs.jsonl` (clusters), `outline/mapping.tsv` | C2 `section-mapper` (coverage) OR C3 `subsection-briefs` (better clusters) -> rerun `evidence-draft` |
| `bindings.binding_gaps` mentions benchmarks/metrics/protocol | binder cannot find evaluation-tagged evidence for this subsection | `outline/evidence_binding_report.md` (tag mix), `papers/evidence_bank.jsonl` tags | C3 `paper-notes` (tag/evidence extraction) OR C2 expand mapping for that subsection -> rerun `evidence-binder` |
| `binding_gaps` mentions security/threat model/attacks | mapped set lacks security-focused works or notes lack threat-model detail | `outline/mapping.tsv`, `papers/paper_notes.jsonl` | C2 expand mapping (+ C1 queries if needed) OR C3 enrich notes -> rerun binder/packs |
| `binding report` looks mechanically uniform across H3 (same mix, low tag variance) | binder selection too recipe-like OR evidence bank tags too coarse | `outline/evidence_binding_report.md` (tag mix), evidence bank tags | tighten `required_evidence_fields` + improve evidence bank tags, then rerun binder; avoid writing around non-specific bindings |
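
The matrix is meant as a semantic guide rather than a script, but its four `blocking_missing` rows can still be illustrated as a lookup. This is a rough sketch: the case-insensitive substring match and the names `ROUTES` and `route_blocking_missing` are assumptions, and the fix-chain strings are copied from the matrix. A real router would also inspect the artifacts listed under "Inspect first" before committing to a chain.

```python
# Symptom phrase -> smallest upstream fix chain, copied from the matrix above.
ROUTES = [
    ("no usable citation keys",
     "C3 paper-notes (ensure bibkeys) -> C4 citation-verifier -> "
     "rerun evidence-binder -> rerun evidence-draft"),
    ("title-only evidence",
     "C1 literature-engineer (enrich metadata) OR C3 pdf-text-extractor "
     "(fulltext) -> rerun paper-notes"),
    ("no evidence snippets extractable",
     "C3 paper-notes (richer extraction; prefer fulltext when possible) -> "
     "rerun C4 packs"),
    ("no concrete evaluation tokens",
     "C3 paper-notes (extract eval anchors) -> rerun anchor-sheet + evidence-draft"),
]


def route_blocking_missing(message):
    """Return the fix chain for a blocking_missing message, or None if unknown."""
    lowered = message.lower()
    for symptom, fix_chain in ROUTES:
        # Assumed matching rule: the symptom phrase appears in the message.
        if symptom in lowered:
            return fix_chain
    return None
```

Returning `None` for unrecognized messages leaves room for a human to judge the root cause instead of forcing a mechanical match.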

Interface with the writer self-loop (avoid writing around evidence)

If writer-selfloop reports FAIL because of missing anchors/comparisons and the corresponding writer pack has pack_warnings, stop and run this evidence self-loop: the section is telling you the pack is not writeable.
Prefer fixing evidence gaps once, upstream, rather than patching every H3 with generic filler.

What this skill does NOT do

It does not edit papers/*, outline/*, or sections/*.
It does not invent new facts/citations.
It does not "relax" quality by changing thresholds; it routes you to the earliest artifact to fix.

Script

Quick Start

python .codex/skills/evidence-selfloop/scripts/run.py --workspace workspaces/<ws>

All Options

--workspace <dir>
--unit-id <U###> (optional)
--inputs <semicolon-separated> (optional override)
--outputs <semicolon-separated> (optional override; default writes output/EVIDENCE_SELFLOOP_TODO.md)
--checkpoint <C#> (optional)

Examples

Generate an evidence TODO list after C4 packs are generated:
- python .codex/skills/evidence-selfloop/scripts/run.py --workspace workspaces/<ws>

Maintainer

WILLOSCAR Core maintainer

Source details

Full Name: WILLOSCAR/research-units-pipeline-skills
Branch: main
Path in repo: .codex/skills/evidence-selfloop
Topics: claude-code claude skills codex gpt pipeline research research-paper research-project research-tool tools units vibe vibe-coding vibecoding
