Agent skill
paragraph-curator
Structured paragraph curation for C5: **select -> evaluate -> subset -> fuse**, so drafts converge instead of only expanding. **Trigger**: paragraph curator, curation, select evaluate fuse, paragraph selection, 选段, 评价, 融合, 收敛, 去冗余. **Use when**: you are in C5, `sections/*.md` exist, and the writing loop drifts toward 'longer by accumulation' (repetition, redundant paragraphs, weak synthesis). **Skip if**: evidence packs are thin / `evidence-selfloop` is BLOCKED; or you are pre-C2 (NO PROSE). **Network**: none. **Guardrail**: do not invent facts; do not add/remove citation keys; do not move citations across subsections; keep section-level claims consistent with `output/ARGUMENT_SKELETON.md# Consistency Contract`.
Install this agent skill to your Project
npx add-skill https://github.com/WILLOSCAR/research-units-pipeline-skills/tree/main/.codex/skills/paragraph-curator
SKILL.md
Paragraph Curator (select -> evaluate -> subset -> fuse)
Purpose: turn “keep rewriting and getting longer” into a controlled convergence step.
This skill adds a decision layer between “draft paragraphs” and “polish voice”:
- keep the best paragraphs
- merge redundant ones
- rewrite for clearer argument moves
- expand only when coverage is missing (using existing evidence cards)
This is a content-structure pass (not a style pass). Run style-harmonizer and opener-variator after curation.
Inputs
Required:
- `sections/` (especially H3 bodies: `sections/S<sub_id>.md`)
- `outline/writer_context_packs.jsonl` (what each H3 must cover + allowed citations)
- `output/ARGUMENT_SKELETON.md` (single source of truth for terminology + premises)
Recommended:
- `output/SECTION_ARGUMENT_SUMMARIES.jsonl` (paragraph moves + outputs)
- `output/SECTION_LOGIC_REPORT.md` (paragraph linkage risks)
- `output/WRITER_SELFLOOP_TODO.md` (style smells / scope/citation warnings)
Outputs
- Updated `sections/*.md` (same filenames; body-only; no headings)
- `output/PARAGRAPH_CURATION_REPORT.md` (short; PASS/FAIL + what changed)
- Create `sections/paragraphs_curated.refined.ok` when done (empty file; pipeline contract signal)
What this skill optimizes (rubric)
You are not trying to “shorten”. You are trying to increase information density while keeping the section verifiable.
Score each paragraph on a simple 0-2 rubric:
| Criterion | 0 (bad) | 1 (ok) | 2 (good) |
|---|---|---|---|
| Coverage | does not match any required axis/card | matches one axis, thin | directly executes a must-use card/comparison |
| Novelty | repeats nearby content | partially redundant | adds a distinct comparison/insight |
| Move clarity | unclear what it does | move exists, weak output | clear move + reusable output |
| Consistency | premise/term drift vs skeleton | minor mismatch | fully aligned with Consistency Contract |
| Citation hygiene | uncited when it should be; cite-dump vibe | acceptable | citations are local and anchored (not just tail) |
| Fusion readiness | cannot merge; tangled | mergeable with edits | clean unit that can be fused or kept |
Decision labels:
- `KEEP`: keep mostly as-is
- `REWRITE`: keep content, rewrite for clearer move/output
- `FUSE`: merge with neighbor(s) and rewrite into one stronger paragraph
- `REPLACE`: keep the slot, but rewrite using existing evidence cards (when coverage is missing)
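The rubric and labels above can be held in a small scratch structure while curating. The sketch below is illustrative only: the `ParagraphScore` class, its field names, and the thresholds in `suggest_label` are assumptions. The skill defines the 0-2 scale and the four labels, but it does not prescribe a rule for mapping scores to labels.

```python
from dataclasses import dataclass, fields


@dataclass
class ParagraphScore:
    """One paragraph scored 0-2 on each rubric criterion (hypothetical container)."""
    coverage: int
    novelty: int
    move_clarity: int
    consistency: int
    citation_hygiene: int
    fusion_readiness: int

    def __post_init__(self) -> None:
        # Reject anything outside the 0-2 scale used by the rubric table.
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in (0, 1, 2):
                raise ValueError(f"{f.name} must be 0, 1, or 2, got {value!r}")

    def total(self) -> int:
        return sum(getattr(self, f.name) for f in fields(self))


def suggest_label(score: ParagraphScore) -> str:
    """Assumed heuristic for a first-pass label; the curator makes the final call."""
    if score.coverage == 0:
        return "REPLACE"  # keep the slot, rewrite from existing evidence cards
    if score.novelty == 0:
        # Redundant with nearby content: merge if possible, otherwise rewrite.
        return "FUSE" if score.fusion_readiness >= 1 else "REWRITE"
    weak = min(score.move_clarity, score.consistency, score.citation_hygiene) < 2
    return "REWRITE" if weak else "KEEP"
```

A suggested label is only a starting point; two adjacent paragraphs on the same axis should still be fused even if each scores well in isolation.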
Paragraph budget (profile-aware)
Default per-H3 target:
- `draft_profile=survey`: 10-12 paragraphs
- `draft_profile=deep`: 11-13 paragraphs
If you exceed the budget, do not delete content blindly. Prefer FUSE (merge redundancy) and make the fused paragraph denser.
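A budget check is simple enough to sketch. The example below assumes that paragraphs in a `sections/S<sub_id>.md` body are separated by blank lines; the function names and the `"over"` / `"under"` / `"ok"` return values are hypothetical, chosen for illustration.

```python
import re

# Per-H3 paragraph targets stated above, keyed by draft_profile.
BUDGETS = {"survey": (10, 12), "deep": (11, 13)}


def count_paragraphs(body: str) -> int:
    """Count blank-line-separated paragraphs (assumed body layout)."""
    return sum(1 for block in re.split(r"\n\s*\n", body) if block.strip())


def budget_status(paragraph_count: int, draft_profile: str = "survey") -> str:
    """Compare a paragraph count against the profile's target range."""
    low, high = BUDGETS[draft_profile]
    if paragraph_count > high:
        return "over"   # prefer FUSE over blind deletion
    if paragraph_count < low:
        return "under"  # expand only if required coverage is missing
    return "ok"
```

For example, `budget_status(count_paragraphs(text), "deep")` returns `"over"` for a 14-paragraph body, which is the signal to look for fusable neighbors.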
Must-have coverage checklist (per H3)
Each H3 must contain at least:
- 1x `Definition/Setup` (only if this H3 introduces a new term/protocol field)
- 2x concrete `Contrast` paragraphs (A-vs-B comparisons; not just “many papers do...”)
- 1x `Evaluation anchor` paragraph (task + metric + constraint/budget/tool access; cite-backed)
- 1x cross-paper `Synthesis` paragraph (what generalizes, what does not; cite-backed)
- 1x `Boundary/Failure` paragraph (limitations; threats to validity; cite-backed when possible)
- 1x `Local conclusion` (a reusable takeaway used downstream)
If any item is missing, use REPLACE to write that paragraph from the writer context pack (do not invent new facts).
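The checklist amounts to comparing the move types in your paragraph inventory against required minimums. A minimal sketch follows, assuming each paragraph has already been tagged with one move name; the snake_case move names and the `missing_moves` helper are hypothetical.

```python
from collections import Counter

# Minimum count per move type, mirroring the checklist above.
# "definition_setup" is conditional and added only when the H3 introduces a new term.
REQUIRED_MOVES = {
    "contrast": 2,
    "evaluation_anchor": 1,
    "synthesis": 1,
    "boundary_failure": 1,
    "local_conclusion": 1,
}


def missing_moves(paragraph_moves: list[str], introduces_new_term: bool = False) -> dict[str, int]:
    """Return {move: how many more paragraphs are needed} for unmet requirements."""
    required = dict(REQUIRED_MOVES)
    if introduces_new_term:
        required["definition_setup"] = 1
    counts = Counter(paragraph_moves)
    return {
        move: need - counts[move]
        for move, need in required.items()
        if counts[move] < need
    }
```

An empty result means the checklist is satisfied; each remaining entry is a candidate for `REPLACE`, or for routing upstream if the evidence is not there.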
Workflow (minimal)
- Pick the target set
  - Start with the H3 bodies listed in `output/SECTION_LOGIC_REPORT.md`, plus any H3 flagged in `output/WRITER_SELFLOOP_TODO.md` as repetitive/template-y, plus any H3 that keeps growing across edits.
  - Work file-by-file: each target is a concrete `sections/S<sub_id>.md`.
- Build a paragraph inventory (scratch only; do not paste into the paper)
  - If `output/SECTION_ARGUMENT_SUMMARIES.jsonl` exists, use its per-paragraph `moves`/`output` as the first draft of your inventory, then reconcile with the actual text.
  - For each paragraph, write one line: `P<i> :: move(s) -> output (1 sentence) :: citations (keys)`
- Apply the rubric and label each paragraph
  - Mark `KEEP` / `REWRITE` / `FUSE` / `REPLACE`.
  - If two adjacent paragraphs repeat the same axis, `FUSE`.
  - For any paragraph you plan to change (`REWRITE` / `REPLACE` / `FUSE`), draft 2-3 candidate rewrites in parallel (different angles: contrast-first / protocol-first / synthesis-first).
    - Score candidates quickly with the rubric; keep one winner (or fuse two if they cover complementary axes).
    - Keep citation keys unchanged while sampling; you are choosing surface form + structure, not changing the evidence set.
- Construct the curated set
  - Use `outline/writer_context_packs.jsonl` to enforce must-have coverage (`paragraph_plan` / `must_use` / `comparison_cards` / `limitation_hooks`) without inventing new content.
  - Enforce the must-have coverage checklist.
  - Enforce the paragraph budget by fusing redundancy rather than deleting substance.
- Fuse + rewrite (keep citation keys fixed)
  Rules that keep the pipeline stable:
  - Do not add/remove citation keys; when fusing, carry citations forward and re-anchor them to the right sentence.
  - Do not move citations across subsections.
  - Avoid adjacent citation blocks (e.g., `[@a] [@b]`) and duplicate keys in one block (e.g., `[@a; @a]`).
  - When fusing, it is often faster to write two fused candidates (one contrast-heavy, one synthesis-heavy) and pick the better one.
- Write the report + marker
  - `output/PARAGRAPH_CURATION_REPORT.md` should be short and actionable:
    - `- Status: PASS|FAIL`
    - per H3: paragraph count before/after; what was fused; any remaining gaps
    - (minimal) how many candidates you tried for the main rewrites (e.g., 2-3), so future passes can see whether this was a real selection step
  - Create `sections/paragraphs_curated.refined.ok`.
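The final workflow step (report plus marker) can be sketched as below. This is an assumed layout, not the skill's own `run.py`: the `H3Result` fields, the report line format beyond the `- Status:` line, the illustrative file name `sections/S3_1.md`, and the choice to write the marker only on PASS are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class H3Result:
    """Curation outcome for one H3 body file (hypothetical record)."""
    section_file: str
    paragraphs_before: int
    paragraphs_after: int
    fused: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)
    candidates_tried: int = 0


def render_report(results: list[H3Result]) -> str:
    """Build a short report; FAIL if any H3 still has coverage gaps (assumed rule)."""
    status = "FAIL" if any(r.gaps for r in results) else "PASS"
    lines = [f"- Status: {status}"]
    for r in results:
        lines.append(f"- {r.section_file}: {r.paragraphs_before} -> {r.paragraphs_after} paragraphs")
        lines.append(f"  - fused: {', '.join(r.fused) or 'none'}")
        lines.append(f"  - remaining gaps: {', '.join(r.gaps) or 'none'}")
        lines.append(f"  - candidates tried: {r.candidates_tried}")
    return "\n".join(lines) + "\n"


def write_outputs(workspace: Path, results: list[H3Result]) -> None:
    """Write the report, then create the empty marker file only on PASS."""
    report_text = render_report(results)
    report = workspace / "output" / "PARAGRAPH_CURATION_REPORT.md"
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text(report_text, encoding="utf-8")
    if report_text.startswith("- Status: PASS"):
        marker = workspace / "sections" / "paragraphs_curated.refined.ok"
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.touch()  # empty file; pipeline contract signal
```

Withholding the marker on FAIL follows from the routing rule to stop and go upstream when a gap cannot be filled, but the skill text only says to create it "when done", so adjust this if your pipeline expects otherwise.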
Routing rules
- If you cannot fill a missing must-have paragraph without new evidence: stop and route upstream (`evidence-selfloop` / C3-C4). Do not pad.
- If you feel forced to change a definition or evaluation premise: update `output/ARGUMENT_SKELETON.md# Consistency Contract` first, then rerun `argument-selfloop`.
- If the only issue is surface cadence/openers: do not overwork curation; run `style-harmonizer` / `opener-variator`.
Done checklist
- Each targeted H3 stays within its paragraph budget (survey 10-12; deep 11-13) without losing required moves.
- Redundant paragraphs are fused into denser, clearer ones (not just deleted).
- No citation keys were added/removed; citation shape is reader-facing (no adjacent blocks, no dup keys).
- `output/PARAGRAPH_CURATION_REPORT.md` exists and is understandable.
- `sections/paragraphs_curated.refined.ok` exists.
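The citation items in this checklist can be verified mechanically by comparing a section body before and after curation. The sketch below assumes bracketed `[@key; @key]` citations, as in the `[@a] [@b]` examples above; the regular expressions and the `citation_issues` helper are hypothetical and will not cover every citation form.

```python
import re

# A citation block is a bracketed span containing at least one "@key".
CITE_BLOCK = re.compile(r"\[([^\[\]]*@[^\[\]]*)\]")
# Assumed key alphabet: word characters, colon, hyphen.
CITE_KEY = re.compile(r"@([\w:-]+)")
# Two citation blocks separated only by whitespace.
ADJACENT_BLOCKS = re.compile(r"\[[^\[\]]*@[^\[\]]*\]\s*\[[^\[\]]*@[^\[\]]*\]")


def extract_keys(text: str) -> list[str]:
    """Return every citation key in order of appearance."""
    return [key for block in CITE_BLOCK.findall(text) for key in CITE_KEY.findall(block)]


def citation_issues(before: str, after: str) -> list[str]:
    """List violations of the citation rules for one section file."""
    issues = []
    before_keys, after_keys = set(extract_keys(before)), set(extract_keys(after))
    issues += [f"added key: {k}" for k in sorted(after_keys - before_keys)]
    issues += [f"removed key: {k}" for k in sorted(before_keys - after_keys)]
    if ADJACENT_BLOCKS.search(after):
        issues.append("adjacent citation blocks")
    for block in CITE_BLOCK.findall(after):
        keys = CITE_KEY.findall(block)
        if len(keys) != len(set(keys)):
            issues.append(f"duplicate key in block: [{block}]")
    return issues
```

Running the check per file also catches citations moved across subsections, since a key that migrates shows up as removed in one file and added in another.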
Script
Quick Start
python .codex/skills/paragraph-curator/scripts/run.py --workspace workspaces/<ws>
All Options
- `--workspace <dir>` (required)
- `--unit-id <U###>`
- `--inputs <semicolon-separated>`
- `--outputs <semicolon-separated>`
- `--checkpoint <C#>`
Examples
- Curate paragraphs in a survey workspace:
python .codex/skills/paragraph-curator/scripts/run.py --workspace workspaces/survey-llm-agents