paragraph-curator

Structured paragraph curation for C5: **select -> evaluate -> subset -> fuse**, so that drafts converge instead of only expanding. **Trigger**: paragraph curator, curation, select evaluate fuse, paragraph selection, 选段 (paragraph selection), 评价 (evaluation), 融合 (fusion), 收敛 (convergence), 去冗余 (redundancy removal). **Use when**: you are in C5, `sections/*.md` exist, and the writing loop drifts toward 'longer by accumulation' (repetition, redundant paragraphs, weak synthesis). **Skip if**: evidence packs are thin or `evidence-selfloop` is BLOCKED, or you are pre-C2 (NO PROSE). **Network**: none. **Guardrail**: do not invent facts; do not add/remove citation keys; do not move citations across subsections; keep section-level claims consistent with `output/ARGUMENT_SKELETON.md# Consistency Contract`.


Install this agent skill to your Project

npx add-skill https://github.com/WILLOSCAR/research-units-pipeline-skills/tree/main/.codex/skills/paragraph-curator

SKILL.md

Paragraph Curator (select -> evaluate -> subset -> fuse)

Purpose: turn “keep rewriting and getting longer” into a controlled convergence step.

This skill adds a decision layer between “draft paragraphs” and “polish voice”:

keep the best paragraphs
merge redundant ones
rewrite for clearer argument moves
expand only when coverage is missing (using existing evidence cards)

This is a content-structure pass (not a style pass). Run style-harmonizer and opener-variator after curation.

Inputs

Required:

sections/ (especially H3 bodies: sections/S<sub_id>.md)
outline/writer_context_packs.jsonl (what each H3 must cover + allowed citations)
output/ARGUMENT_SKELETON.md (single source of truth for terminology + premises)

Recommended:

output/SECTION_ARGUMENT_SUMMARIES.jsonl (paragraph moves + outputs)
output/SECTION_LOGIC_REPORT.md (paragraph linkage risks)
output/WRITER_SELFLOOP_TODO.md (style smells / scope/citation warnings)

Outputs

Updated sections/*.md (same filenames; body-only; no headings)
output/PARAGRAPH_CURATION_REPORT.md (short; PASS/FAIL + what changed)
Create sections/paragraphs_curated.refined.ok when done (empty file; pipeline contract signal)

What this skill optimizes (rubric)

You are not trying to “shorten”. You are trying to increase information density while keeping the section verifiable.

Score each paragraph on a simple 0-2 rubric:

Criterion	0 (bad)	1 (ok)	2 (good)
Coverage	does not match any required axis/card	matches one axis, thin	directly executes a must-use card/comparison
Novelty	repeats nearby content	partially redundant	adds a distinct comparison/insight
Move clarity	unclear what it does	move exists, weak output	clear move + reusable output
Consistency	premise/term drift vs skeleton	minor mismatch	fully aligned with Consistency Contract
Citation hygiene	uncited when it should be; cite-dump vibe	acceptable	citations are local and anchored (not just tail)
Fusion readiness	cannot merge; tangled	mergeable with edits	clean unit that can be fused or kept
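
The rubric above can be kept as a small record per paragraph. The following is a minimal sketch, not the skill's own implementation: the class name `RubricScore`, the field names, and the unweighted `total()` are assumptions made for illustration, since the rubric defines the six criteria and the 0-2 scale but no weights or thresholds.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RubricScore:
    """One paragraph's scores; each criterion is 0 (bad), 1 (ok), or 2 (good)."""
    coverage: int
    novelty: int
    move_clarity: int
    consistency: int
    citation_hygiene: int
    fusion_readiness: int

    def __post_init__(self):
        # Reject anything outside the 0-2 scale used by the rubric.
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in (0, 1, 2):
                raise ValueError(f"{f.name} must be 0, 1, or 2, got {value!r}")

    def total(self) -> int:
        """Unweighted sum (assumption: the rubric does not define weights)."""
        return sum(getattr(self, f.name) for f in fields(self))

    def weakest(self) -> list[str]:
        """Criteria scored 0, i.e. the first things to look at."""
        return [f.name for f in fields(self) if getattr(self, f.name) == 0]
```

A paragraph scored `RubricScore(2, 0, 1, 2, 1, 0)` would report `novelty` and `fusion_readiness` as its weakest criteria, which is a hint toward FUSE or REWRITE rather than a rule.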

Decision labels:

KEEP: keep mostly as-is
REWRITE: keep content, rewrite for clearer move/output
FUSE: merge with neighbor(s) and rewrite into one stronger paragraph
REPLACE: keep the slot, but rewrite using existing evidence cards (when coverage is missing)

Paragraph budget (profile-aware)

Default per-H3 target:

draft_profile=survey: 10-12 paragraphs
draft_profile=deep: 11-13 paragraphs

If you exceed the budget, do not delete content blindly. Prefer FUSE (merge redundancy) and make the fused paragraph denser.
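
A budget check along these lines could flag which H3 bodies need fusing. This is a sketch under stated assumptions: paragraphs are taken to be separated by blank lines, and the names `PARAGRAPH_BUDGET`, `split_paragraphs`, and `budget_status` are hypothetical; only the numeric ranges come from the text above.

```python
import re

# Per-H3 paragraph budgets from the text above: (min, max), inclusive.
PARAGRAPH_BUDGET = {"survey": (10, 12), "deep": (11, 13)}


def split_paragraphs(body: str) -> list[str]:
    """Split a section body on blank lines (assumed paragraph delimiter)."""
    return [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]


def budget_status(body: str, draft_profile: str) -> str:
    """Return 'over', 'under', or 'within' for the given draft profile."""
    low, high = PARAGRAPH_BUDGET[draft_profile]
    count = len(split_paragraphs(body))
    if count > high:
        return "over"   # prefer FUSE over blind deletion
    if count < low:
        return "under"  # check the coverage checklist before expanding
    return "within"
```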

Must-have coverage checklist (per H3)

Each H3 must contain at least:

1x Definition/Setup (only if this H3 introduces a new term/protocol field)
2x concrete Contrast paragraphs (A-vs-B comparisons; not just “many papers do...”)
1x Evaluation anchor paragraph (task + metric + constraint/budget/tool access; cite-backed)
1x cross-paper Synthesis paragraph (what generalizes, what does not; cite-backed)
1x Boundary/Failure paragraph (limitations; threats to validity; cite-backed when possible)
1x Local conclusion (a reusable takeaway used downstream)

If any item is missing, use REPLACE to write that paragraph from the writer context pack (do not invent new facts).
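
The checklist can be expressed as minimum counts per move type. In the sketch below, the move labels (`contrast`, `evaluation_anchor`, and so on) and the function name `missing_moves` are illustrative assumptions; the minimum counts mirror the checklist, and the Definition/Setup item is only required when the H3 introduces a new term or protocol field.

```python
from collections import Counter

# Minimum paragraph counts per H3, taken from the checklist above.
REQUIRED_MOVES = {
    "contrast": 2,
    "evaluation_anchor": 1,
    "synthesis": 1,
    "boundary_failure": 1,
    "local_conclusion": 1,
}


def missing_moves(moves: list[str], introduces_new_term: bool = False) -> dict[str, int]:
    """Return {move: shortfall} for every checklist item not yet satisfied."""
    required = dict(REQUIRED_MOVES)
    if introduces_new_term:
        # Definition/Setup is conditional in the checklist.
        required["definition_setup"] = 1
    have = Counter(moves)
    return {m: need - have[m] for m, need in required.items() if have[m] < need}
```

Each key in the returned mapping is a candidate for REPLACE, written from the writer context pack rather than from new facts.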

Workflow (minimal)

Pick the target set

Start with the H3 bodies listed in output/SECTION_LOGIC_REPORT.md, plus any H3 flagged in output/WRITER_SELFLOOP_TODO.md as repetitive/template-y, plus any H3 that keeps growing across edits.
Work file-by-file: each target is a concrete sections/S<sub_id>.md.

Build a paragraph inventory (scratch only; do not paste into the paper)

If output/SECTION_ARGUMENT_SUMMARIES.jsonl exists, use its per-paragraph moves/output as the first draft of your inventory, then reconcile with the actual text. For each paragraph, write one line:
P<i> :: move(s) -> output (1 sentence) :: citations (keys)
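
To keep inventory lines uniform, a pair of helpers could render and parse them. This is a rough sketch: the line shape follows the format above, but joining multiple moves with ` + ` and citation keys with `, ` is an assumption, and the keys in any example are hypothetical.

```python
def inventory_line(index: int, moves: list[str], output: str, citation_keys: list[str]) -> str:
    """Render one scratch inventory line in the `P<i> :: ... :: ...` shape."""
    return f"P{index} :: {' + '.join(moves)} -> {output} :: {', '.join(citation_keys)}"


def parse_inventory_line(line: str) -> dict:
    """Inverse of inventory_line; assumes ' :: ' and ' -> ' do not occur inside fields."""
    label, middle, cites = line.split(" :: ")
    moves, output = middle.split(" -> ", 1)
    return {
        "index": int(label[1:]),            # strip the leading "P"
        "moves": moves.split(" + "),
        "output": output,
        "citations": [k for k in cites.split(", ") if k],
    }
```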

Apply the rubric and label each paragraph

Mark KEEP/REWRITE/FUSE/REPLACE.
If two adjacent paragraphs repeat the same axis, FUSE.
For any paragraph you plan to change (REWRITE/REPLACE/FUSE), draft 2-3 candidate rewrites in parallel (different angles: contrast-first / protocol-first / synthesis-first).
- Score candidates quickly with the rubric; keep one winner (or fuse two if they cover complementary axes).
- Keep citation keys unchanged while sampling; you are choosing surface form + structure, not changing the evidence set.

Construct the curated set

Use outline/writer_context_packs.jsonl to enforce must-have coverage (paragraph_plan/must_use/comparison_cards/limitation_hooks) without inventing new content.
Enforce the must-have coverage checklist.
Enforce the paragraph budget by fusing redundancy rather than deleting substance.

Fuse + rewrite (keep citation keys fixed)

Rules that keep the pipeline stable:

Do not add/remove citation keys; when fusing, carry citations forward and re-anchor them to the right sentence.
Do not move citations across subsections.
Avoid adjacent citation blocks (e.g., [@a] [@b]) and duplicate keys in one block (e.g., [@a; @a]).
When fusing, it is often faster to write two fused candidates (one contrast-heavy, one synthesis-heavy) and pick the better one.
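
The citation rules above lend themselves to a mechanical check on one subsection before and after fusion. The sketch below assumes citation blocks look like `[@a]` or `[@a; @b]` with plain keys (no prefixes or locators); the function names are hypothetical and this is not the pipeline's own validator.

```python
import re
from collections import Counter

CITE_BLOCK = re.compile(r"\[(@[^\]]+)\]")                   # e.g. [@a] or [@a; @b]
ADJACENT_BLOCKS = re.compile(r"\[@[^\]]+\]\s*\[@[^\]]+\]")  # e.g. [@a] [@b]


def citation_keys(text: str) -> set[str]:
    """All distinct keys cited anywhere in the text."""
    keys = set()
    for block in CITE_BLOCK.findall(text):
        keys.update(k.strip().lstrip("@") for k in block.split(";"))
    return keys


def hygiene_problems(before: str, after: str) -> list[str]:
    """Compare a subsection before/after fusion and list rule violations."""
    problems = []
    added = citation_keys(after) - citation_keys(before)
    removed = citation_keys(before) - citation_keys(after)
    if added:
        problems.append(f"added keys: {sorted(added)}")
    if removed:
        problems.append(f"removed keys: {sorted(removed)}")
    if ADJACENT_BLOCKS.search(after):
        problems.append("adjacent citation blocks")
    for block in CITE_BLOCK.findall(after):
        counts = Counter(k.strip() for k in block.split(";"))
        if any(n > 1 for n in counts.values()):
            problems.append(f"duplicate key in block: [{block}]")
    return problems
```

Running the check per subsection file, rather than over the whole draft, also catches citations that moved across subsections, because a moved key shows up as removed in one file and added in another.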

Write the report + marker

output/PARAGRAPH_CURATION_REPORT.md should be short and actionable:
- Status: PASS|FAIL
- per H3: paragraph count before/after; what was fused; any remaining gaps
- (minimal) how many candidates you tried for the main rewrites (e.g., 2-3), so future passes can see whether this was a real selection step
Create sections/paragraphs_curated.refined.ok.
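
Writing the report and the marker could look like the sketch below. The function name, the `per_h3` dictionary keys, and the exact report line wording are assumptions for illustration; only the two file paths and the `Status: PASS|FAIL` line come from the text above. The sketch creates the marker regardless of status, because the source does not say whether a FAIL should withhold it.

```python
from pathlib import Path


def write_report_and_marker(workspace: str, status: str, per_h3: list[dict]) -> None:
    """Write the curation report, then create the empty marker file."""
    if status not in ("PASS", "FAIL"):
        raise ValueError("status must be PASS or FAIL")
    ws = Path(workspace)
    lines = [f"- Status: {status}"]
    for h3 in per_h3:
        # One short line per H3: counts, what was fused, gaps, candidates tried.
        lines.append(
            f"- {h3['file']}: {h3['before']} -> {h3['after']} paragraphs; "
            f"fused: {h3['fused']}; gaps: {h3['gaps']}; "
            f"candidates tried: {h3['candidates']}"
        )
    (ws / "output").mkdir(parents=True, exist_ok=True)
    report = ws / "output" / "PARAGRAPH_CURATION_REPORT.md"
    report.write_text("\n".join(lines) + "\n", encoding="utf-8")
    (ws / "sections").mkdir(parents=True, exist_ok=True)
    # Empty file; acts as the pipeline contract signal.
    (ws / "sections" / "paragraphs_curated.refined.ok").touch()
```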

Routing rules

If you cannot fill a missing must-have paragraph without new evidence: stop and route upstream (evidence-selfloop / C3-C4). Do not pad.
If you feel forced to change a definition or evaluation premise: update output/ARGUMENT_SKELETON.md# Consistency Contract first, then rerun argument-selfloop.
If the only issue is surface cadence/openers: do not overwork curation; run style-harmonizer / opener-variator.

Done checklist

Each targeted H3 stays within its paragraph budget (survey 10-12; deep 11-13) without losing required moves.
Redundant paragraphs are fused into denser, clearer ones (not just deleted).
No citation keys were added/removed; citation shape is reader-facing (no adjacent blocks, no dup keys).
output/PARAGRAPH_CURATION_REPORT.md exists, is short, and states the status (PASS|FAIL) plus what changed per H3.
sections/paragraphs_curated.refined.ok exists.

Script

Quick Start

python .codex/skills/paragraph-curator/scripts/run.py --workspace workspaces/<ws>

All Options

--workspace <dir> (required)
--unit-id <U###>
--inputs <semicolon-separated>
--outputs <semicolon-separated>
--checkpoint <C#>

Examples

Curate paragraphs in a survey workspace:
- python .codex/skills/paragraph-curator/scripts/run.py --workspace workspaces/survey-llm-agents

Maintainer

WILLOSCAR Core maintainer

Source details

Full Name: WILLOSCAR/research-units-pipeline-skills
Branch: main
Path in repo: .codex/skills/paragraph-curator
Topics: claude-code claude skills codex gpt pipeline research research-paper research-project research-tool tools units vibe vibe-coding vibecoding


