Agent skill
ds-delegate
Subagent delegation for data analysis. Dispatches fresh Task agents with output-first verification.
Install this agent skill to your Project
npx add-skill https://github.com/edwinhu/workflows/tree/main/skills/ds-delegate
SKILL.md
Contents
- The Iron Law of Delegation
- Core Principle
- The Process
- Drive-Aligned Framing
- Rationalization Prevention
## The Iron Law of Delegation
<EXTREMELY-IMPORTANT>
YOU MUST route EVERY ANALYSIS STEP THROUGH A TASK AGENT. This is not negotiable.
You MUST NOT:
- Write analysis code directly
- Run "quick" data checks
- Edit notebooks or scripts
- Make "just this one plot"
If you're about to write analysis code in main chat, STOP. Spawn a Task agent instead.
</EXTREMELY-IMPORTANT>
## Core Principle
Fresh subagent per task + output-first verification = reliable analysis
- Analyst subagent does the work
- Must produce visible output at each step
- Methodology reviewer checks approach
- Loop until output verified
When to Use
Called by ds-implement for each task in PLAN.md. Don't invoke directly.
## The Process
For each task:
1. Dispatch analyst subagent
- If questions → answer, re-dispatch
- Implements with output-first protocol
2. Verify outputs are present and reasonable
3. Dispatch methodology reviewer (if complex)
4. Mark task complete, log to LEARNINGS.md
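The loop above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual orchestration: `run_task` and the five callables passed into it (`dispatch`, `answer`, `verify`, `review`, `log`) are hypothetical stand-ins for the Task dispatch, your answers to the agent, the output check, the methodology reviewer, and the LEARNINGS.md append. The `max_rounds` cap is also an assumption; the skill itself only says to loop until output is verified.

```python
def run_task(task, dispatch, answer, verify, review, log, max_rounds=5):
    """Drive one PLAN.md task through dispatch -> verify -> review -> log.

    All callables are hypothetical stand-ins supplied by the caller.
    Returns True once the task is logged as complete, False if the
    (assumed) round limit is reached first.
    """
    prompt = task["text"]
    for _ in range(max_rounds):
        result = dispatch(prompt)
        # Step 1: if the agent asked questions, answer and re-dispatch.
        if result.get("questions"):
            prompt = prompt + "\n\nAnswers:\n" + answer(result["questions"])
            continue
        # Step 2: outputs must be present and reasonable.
        if not verify(result):
            prompt = task["text"] + "\n\nFix: previous run produced no verifiable output."
            continue
        # Step 3: methodology review only for complex tasks.
        if task.get("complex") and not review(result):
            prompt = task["text"] + "\n\nFix: methodology reviewer raised issues."
            continue
        # Step 4: mark complete and log.
        log(task, result)
        return True
    return False
```

Passing the collaborators in as arguments keeps the sketch free of any real dispatch API and makes the control flow easy to exercise with fakes.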
Task Type Detection
Each task in PLAN.md should have a type field. Detect and route accordingly:
| Task Type | Agent | Constraints | Example Tasks |
|---|---|---|---|
| `engineering` | `workflows:ds-engineer` | `ds-engineering-constraints.md` index + atomic E1-E5 files | ETL, merge, clean, transform, pipeline, schema, join |
| `analysis` | `workflows:ds-analyst` | `ds-analysis-constraints.md` index + atomic A1-A7 files | regression, test, model, visualize, estimate, summarize |
Detection heuristic (when type field is missing):
| Task contains these keywords | Type |
|---|---|
| merge, join, clean, ETL, transform, pipeline, ingest, schema, deduplicate, normalize | engineering |
| regression, estimate, test, model, plot, chart, visualize, summarize, correlate, panel | analysis |
| ambiguous | Default to analysis (safer — analysis constraints are stricter) |
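The heuristic in the table can be expressed as a small function. This is a sketch under stated assumptions: `detect_task_type` is a hypothetical helper, keywords are matched as exact whole words (so "merges" would not match "merge"), and any task that hits both lists or neither list is treated as ambiguous and defaults to `analysis`.

```python
import re

# Keyword lists copied from the detection heuristic table above.
ENGINEERING_KEYWORDS = {
    "merge", "join", "clean", "etl", "transform", "pipeline",
    "ingest", "schema", "deduplicate", "normalize",
}
ANALYSIS_KEYWORDS = {
    "regression", "estimate", "test", "model", "plot", "chart",
    "visualize", "summarize", "correlate", "panel",
}


def detect_task_type(task_text, declared_type=None):
    """Return 'engineering' or 'analysis' for a PLAN.md task (hypothetical helper)."""
    # An explicit type field always wins over the keyword heuristic.
    if declared_type in ("engineering", "analysis"):
        return declared_type
    words = set(re.findall(r"[a-z]+", task_text.lower()))
    has_engineering = bool(words & ENGINEERING_KEYWORDS)
    has_analysis = bool(words & ANALYSIS_KEYWORDS)
    if has_engineering and not has_analysis:
        return "engineering"
    # Analysis-only, mixed, or no keywords: default to the stricter constraints.
    return "analysis"
```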
## Step 1: Dispatch Analyst/Engineer
Pattern: Use the structured delegation template from `references/delegation-template.md`.
Every delegation MUST include:
- TASK - What to analyze
- EXPECTED OUTCOME - Success criteria
- REQUIRED SKILLS - Statistical/ML methods needed
- REQUIRED TOOLS - Data access and analysis tools
- MUST DO - Output-first verification
- MUST NOT DO - Methodology violations
- CONTEXT - Data sources and previous work
- VERIFICATION - Output requirements
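One way to catch an incomplete delegation before dispatching is to check the prompt for the eight sections listed above. The sketch below is illustrative only: `missing_sections` is a hypothetical helper, it assumes each section appears as a markdown heading, and the alias that lets an "Output-First Protocol (MANDATORY)" heading count as VERIFICATION is an assumption about how the templates in this skill cover that section.

```python
# Required section names from the list above, each with the heading texts
# accepted for it. The VERIFICATION alias is an assumption, not part of the skill.
REQUIRED_SECTIONS = {
    "TASK": ("TASK",),
    "EXPECTED OUTCOME": ("EXPECTED OUTCOME",),
    "REQUIRED SKILLS": ("REQUIRED SKILLS",),
    "REQUIRED TOOLS": ("REQUIRED TOOLS",),
    "MUST DO": ("MUST DO",),
    "MUST NOT DO": ("MUST NOT DO",),
    "CONTEXT": ("CONTEXT",),
    "VERIFICATION": ("VERIFICATION", "OUTPUT-FIRST PROTOCOL (MANDATORY)"),
}


def missing_sections(prompt):
    """Return the required section names with no matching markdown heading."""
    headings = {
        line.strip().lstrip("#").strip().upper()
        for line in prompt.splitlines()
        if line.strip().startswith("#")
    }
    return [
        name
        for name, accepted in REQUIRED_SECTIONS.items()
        if not headings & set(accepted)
    ]
```

An empty list means every section is present; anything else names what to add before dispatching.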
Use this Task invocation (fill in brackets). Route based on task type detected above:
All paths below are relative to this skill's base directory.
**For `analysis` tasks:**

````python
Task(subagent_type="workflows:ds-analyst", prompt="""
# TASK

Analyze: [TASK NAME]

## EXPECTED OUTCOME

You will have successfully completed this task when:
- [ ] [Specific analysis output 1]
- [ ] [Specific analysis output 2]
- [ ] Output-first verification at each step
- [ ] Results documented with evidence

## REQUIRED SKILLS

This task requires:
- [Statistical method]: [Why needed]
- [Programming language]: Data manipulation
- Output-first verification (mandatory)
- SQL reference: Read `../ds-delegate/references/sql-patterns.md` for dialect-specific patterns
- Data quality checks: Read `../ds-implement/references/ds-checks.md` for DQ1-DQ6 verification patterns (mandatory)
- Analysis constraints: Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-analysis-constraints.md` for the constraint index, then load:
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-robustness-checks.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-standard-error-spec.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-visualization-integrity.md`
- Analysis conventions: Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-common-conventions.md` for the convention index, then load:
  Read `${CLAUDE_SKILL_DIR}/../../references/conventions/ds-statistical-validity.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/conventions/ds-p-hacking-prevention.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/conventions/ds-sample-selection.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/conventions/ds-deviation-rules-analysis.md`

## REQUIRED TOOLS

You will need:
- Read: Load datasets and existing code
- Write: Create analysis scripts/notebooks
- Bash: Run analysis and verify outputs

**Tools denied:** None (full analysis access)

## MUST DO

- [ ] Print state BEFORE each operation (shape, head)
- [ ] Print state AFTER each operation (shape, nulls, sample)
- [ ] Verify outputs are reasonable at each step
- [ ] Document methodology decisions

## MUST NOT DO

- ❌ Skip verification outputs
- ❌ Proceed with questionable data without flagging
- ❌ Guess on methodology (ask if unclear)
- ❌ Claim completion without visible outputs

## CONTEXT

### Task Description
[PASTE FULL TASK TEXT FROM PLAN.md]

### Analysis Context
- Analysis objective: [from SPEC.md]
- Data sources: [list with paths]
- Previous steps: [summary from LEARNINGS.md]

## Output-First Protocol (MANDATORY)

For EVERY operation:
1. Print state BEFORE (shape, head)
2. Execute operation
3. Print state AFTER (shape, nulls, sample)
4. Verify output is reasonable

Example:
```python
print(f"Before: {df.shape}")
df = df.merge(other, on='key')
print(f"After: {df.shape}")
print(f"Nulls introduced: {df.isnull().sum().sum()}")
print(df.head())  # print explicitly so the sample also shows when run as a script
```

## Required Outputs by Operation

| Operation | Required Output |
|---|---|
| Load data | shape, dtypes, head() |
| Filter | shape before/after, % removed |
| Merge/Join | shape, null check, sample |
| Groupby | result shape, sample groups |
| Model fit | metrics, convergence |

## If Unclear

Ask questions BEFORE implementing. Don't guess on methodology.

## Output

Report: what you did, key outputs observed, any data quality issues found.
""")
````
**For `engineering` tasks:**

````python
Task(subagent_type="workflows:ds-engineer", prompt="""
# TASK

Engineer: [TASK NAME]

## EXPECTED OUTCOME

You will have successfully completed this task when:
- [ ] [Specific engineering output 1]
- [ ] [Specific engineering output 2]
- [ ] Output-first verification at each step
- [ ] Results documented with evidence

## REQUIRED SKILLS

This task requires:
- [Engineering method]: [Why needed]
- [Programming language]: Data manipulation
- Output-first verification (mandatory)
- SQL reference: Read `../ds-delegate/references/sql-patterns.md` for dialect-specific patterns
- Data quality checks: Read `../ds-implement/references/ds-checks.md` for DQ1-DQ6 verification patterns (mandatory)
- Engineering constraints: Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-engineering-constraints.md` for the constraint index, then load:
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-determinism.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-schema-contracts.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-join-audits.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-idempotency.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-error-handling.md`

## REQUIRED TOOLS

You will need:
- Read: Load datasets and existing code
- Write: Create ETL scripts/pipelines
- Bash: Run transformations and verify outputs

**Tools denied:** None (full engineering access)

## MUST DO

- [ ] Print state BEFORE each operation (shape, head)
- [ ] Print state AFTER each operation (shape, nulls, sample)
- [ ] Verify schema contracts at each step
- [ ] Validate determinism (same input → same output)
- [ ] Check join key uniqueness before merging
- [ ] Document pipeline decisions

## MUST NOT DO

- ❌ Skip verification outputs
- ❌ Proceed with non-deterministic transforms without flagging
- ❌ Introduce silent data loss (row drops without logging)
- ❌ Claim completion without visible outputs

## CONTEXT

### Task Description
[PASTE FULL TASK TEXT FROM PLAN.md]

### Engineering Context
- Pipeline objective: [from SPEC.md]
- Data sources: [list with paths]
- Previous steps: [summary from LEARNINGS.md]

## Output-First Protocol (MANDATORY)

For EVERY operation:
1. Print state BEFORE (shape, head)
2. Execute operation
3. Print state AFTER (shape, nulls, sample)
4. Verify output is reasonable

Example:
```python
print(f"Before: {df.shape}")
df = df.merge(other, on='key')
print(f"After: {df.shape}")
print(f"Nulls introduced: {df.isnull().sum().sum()}")
print(df.head())  # print explicitly so the sample also shows when run as a script
```

## Required Outputs by Operation

| Operation | Required Output |
|---|---|
| Load data | shape, dtypes, head() |
| Filter | shape before/after, % removed |
| Merge/Join | shape, null check, key uniqueness |
| Transform | before/after sample, determinism check |
| Pipeline step | input shape → output shape, schema validation |

## If Unclear

Ask questions BEFORE implementing. Don't guess on architecture.

## Output

Report: what you did, key outputs observed, any data quality or schema issues found.
""")
````
**If agent asks questions:** Answer clearly, especially about methodology choices (analysis) or architecture decisions (engineering).
**If agent completes task:** Verify outputs, then proceed or review.
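The engineering template asks the agent to check join key uniqueness before merging. As a rough, dependency-free illustration of what that check looks like, here is a sketch over plain lists of dicts; `duplicate_keys` and the `customers` sample rows are hypothetical, and a real pipeline would more likely run the equivalent check on a DataFrame.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return {key_value: count} for values of `key` that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample rows; id 2 appears twice, so a merge on "id" would fan out.
customers = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "c"},
]
print(f"Rows: {len(customers)}")
print(f"Duplicate keys: {duplicate_keys(customers, 'id')}")
```

Printing the duplicates, rather than only asserting on them, keeps the check in line with the output-first protocol: the evidence is visible in the agent's report.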
## Step 2: Verify Outputs (Post-Subagent Boundary)
<EXTREMELY-IMPORTANT>
**After the analyst returns, you are at the post-subagent boundary. Constraint C5 from ds-common-constraints.md applies.**
**ALLOWED (Verification):**
- [ ] Read the analyst's returned report/summary
- [ ] Check LEARNINGS.md for output documentation
- [ ] Confirm output files exist (`ls -la`)
- [ ] Compare task counts (expected vs actual)
**FORBIDDEN (Investigation):**
- ❌ Read project source code, notebooks, or data files
- ❌ Run analysis code to "confirm" results
- ❌ Query databases or inspect intermediate files
- ❌ Grep/Glob project files
**If the analyst's report shows problems, re-dispatch a Task agent. Do NOT investigate yourself.**
</EXTREMELY-IMPORTANT>
Upon verification failure, re-dispatch analyst with specific fix instructions.
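The "confirm output files exist" check can stay on the allowed side of the boundary by looking only at file metadata. A minimal sketch, assuming the expected output paths are known from PLAN.md; `missing_outputs` is a hypothetical helper and it never opens or reads the files.

```python
from pathlib import Path


def missing_outputs(expected_paths):
    """Return the expected output files that are absent or empty.

    Only existence and size are inspected (like `ls -la`); file contents
    are never read.
    """
    missing = []
    for raw in expected_paths:
        path = Path(raw)
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(str(raw))
    return missing
```

A non-empty result is a reason to re-dispatch with specific fix instructions, not to open the files and investigate.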
## Step 3: Dispatch Methodology Reviewer (Complex Tasks)
For statistical analysis, modeling, or methodology-sensitive tasks, dispatch a methodology reviewer. **Tailor the review checklist to the task type:**
```python
Task(subagent_type="general-purpose", allowed_tools=["Read", "Glob", "Grep", "Bash(read-only)"], prompt="""
Review methodology for: [TASK NAME]
Task type: [engineering | analysis]

## What Was Done

[SUMMARY FROM ANALYST/ENGINEER OUTPUT]

## Original Requirements

[FROM SPEC.md - especially any replication requirements]

**Tool Restrictions:** The methodology reviewer is READ-ONLY. It reads code, verifies outputs, and returns a verdict. It MUST NOT use Write or Edit.

## CRITICAL: Do Not Trust the Report

The agent may have:
- Reported success without actually running the code
- Cherry-picked output that looks correct
- Glossed over data quality issues
- Made methodology choices without justification

**DO:**
- Read the actual code or notebook cells
- Verify outputs exist and match claims
- Check for silent failures (empty DataFrames, all nulls)
- Confirm assumptions were checked

## Review Checklist: Engineering Tasks

Use this checklist when task type is `engineering`:
- Are schema contracts validated at each pipeline stage?
- Is the pipeline deterministic (same input → same output)?
- Is the transform idempotent (safe to re-run)?
- Are error handling and edge cases covered (empty inputs, missing keys)?
- Are join keys validated for uniqueness before merge?
- Is data loss accounted for (row counts before/after, logged drops)?

## Review Checklist: Analysis Tasks

Use this checklist when task type is `analysis`:
- Is the statistical method appropriate for the data type?
- Are assumptions documented and checked?
- Is sample size adequate for conclusions?
- Is the specification justified (why these controls, why this functional form)?
- Are robustness checks included (alternative specs, subsamples)?
- Is the standard error specification appropriate (clustered, HC, bootstrap)?
- Are there data leakage or p-hacking concerns?
- Is the approach reproducible (seeds, versions)?

## Confidence Scoring

Rate each issue 0-100. Only report issues >= 80 confidence.

## Output Format

- APPROVED: Methodology sound (after verifying code/outputs yourself)
- ISSUES: List concerns with confidence scores and file:line references
""")
```
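The confidence rule in the reviewer prompt (rate 0-100, report only issues at 80 or above) reduces to a simple filter. The sketch below is illustrative: `review_verdict` and the issue-dict shape with a `confidence` key are assumptions, not a format the skill defines.

```python
def review_verdict(issues, threshold=80):
    """Return (verdict, reportable_issues) using the >= 80 confidence rule.

    `issues` is a list of dicts with a numeric "confidence" key
    (hypothetical shape). Issues below the threshold are dropped.
    """
    reportable = sorted(
        (issue for issue in issues if issue["confidence"] >= threshold),
        key=lambda issue: issue["confidence"],
        reverse=True,
    )
    verdict = "ISSUES" if reportable else "APPROVED"
    return verdict, reportable
```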
## Step 4: Log to LEARNINGS.md
Append to `.planning/LEARNINGS.md` after each task:
```markdown
## Task N: [Name] - COMPLETE
**Input:** [describe input state]
**Operation:** [what was done]
**Output:**
- Shape: [final shape]
- Key findings: [observations]
**Verification:**
- [how you confirmed it worked]
**Next:** [what comes next]
```
Gate: Exit Delegation (Per-Task)
Checkpoint type: human-verify (task completion is machine-verifiable)
Before marking any task as complete, execute this gate:
1. IDENTIFY → What proves this task is done?
- Task agent returned output (not just "done")
- Output matches PLAN.md expected output for this task
2. RUN → Read the agent's actual output (not just the summary)
3. READ → Verify: shapes reasonable? No unexpected nulls? Sample looks correct?
4. VERIFY → If statistical task: methodology reviewer approved
5. CLAIM → Only log "Task N: COMPLETE" in LEARNINGS.md if ALL checks pass
If agent returned no visible output, this gate FAILS. Re-dispatch with explicit output requirements.
Skipping output verification is NOT HELPFUL — unverified results lead the user to act on wrong analysis.
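The gate can be read as a conjunction of checks, any one of which blocks the COMPLETE entry. A rough sketch under stated assumptions: `gate_passes` is a hypothetical helper, and matching PLAN.md expectations by substring (`expected_markers`) is a deliberately crude stand-in for actually reading the output.

```python
def gate_passes(agent_output, expected_markers, is_statistical, reviewer_verdict=None):
    """Return True only if every per-task exit check holds.

    agent_output      -- the agent's full returned text
    expected_markers  -- strings the PLAN.md expected output should contain
                         (hypothetical, substring match)
    is_statistical    -- whether a methodology review is required
    reviewer_verdict  -- "APPROVED" or "ISSUES" when a review was run
    """
    text = (agent_output or "").strip()
    # No visible output, or a bare "done", fails the gate outright.
    if not text or text.lower().rstrip(".") == "done":
        return False
    # Output must match what PLAN.md said this task would produce.
    if any(marker not in text for marker in expected_markers):
        return False
    # Statistical tasks additionally need reviewer approval.
    if is_statistical and reviewer_verdict != "APPROVED":
        return False
    return True
```

Only when this returns True would "Task N: COMPLETE" be appended to LEARNINGS.md; a False result means re-dispatching with explicit output requirements.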
## Drive-Aligned Framing
<EXTREMELY-IMPORTANT>
When you say "Step complete", you are asserting:
- A Task agent ran the analysis
- Output was visible and verified by you
- You personally checked it (not just trusting the agent's word)
- Methodology reviewer approved (for statistical tasks)
If ANY of these didn't happen, you are not "summarizing"; you are being anti-helpful by giving the user false confidence in unverified work.
Unverified claims waste the user's time and corrupt their research. Verified "investigating" protects their work.
</EXTREMELY-IMPORTANT>
## Rationalization Prevention
Recognize these thoughts as signals to stop and delegate instead:
| Excuse | Reality | Do Instead |
|---|---|---|
| "I'll just check the shape quickly" | You'll skip the output-first protocol | Delegate to Task agent with full verification |
| "It's just a simple merge" | Your merges fail silently | Delegate with verification requirements |
| "I already know this data" | Your knowing ≠ verified | Delegate anyway with output-first protocol |
| "The subagent will be slower" | Wrong results are slower than slow results | Delegate — correctness beats speed |
| "Just this one plot" | You're hiding data issues with one plot | Delegate with full output requirements |
| "User wants results fast" | They want CORRECT results | Delegate — optimize for correctness, not speed |
| "Skip methodology review, it's standard" | Your "standard" assumptions often fail | Dispatch methodology reviewer anyway |
| "Output looked reasonable" | "Looked reasonable" ≠ verified | Check the actual numbers against expectations |
Drive-Aligned Framing
| Shortcut | Consequence |
|---|---|
| Delegating without context | You spawned a task agent without SPEC/PLAN context. It guesses wrong — your delegation created confusion. |
| Skipping verification of agent output | You trusted the agent's claim of completion. The output is wrong — your trust was negligence. |
Delete & Restart
If you wrote analysis code in the main chat instead of delegating to a task agent, DELETE it immediately and dispatch a Task agent.
Code written in main chat is contaminated by orchestrator context, skips the output-first protocol, and bypasses methodology review. It cannot be salvaged — it must be replaced.
Red Flags
If you catch yourself thinking these, STOP immediately:
- "I can skip output verification this time"
- "I'll chain operations together, it's fine"
- "Unexpected nulls are probably okay"
- "Methodology review takes too long, skip it"
- "The merge probably worked"
- "Output-first protocol is overkill here"
- "I'll just summarize PLAN.md for the analyst" (STOP—provide full text)
When analyst produces no visible output:
- You must re-dispatch with explicit output requirements
- Treat this as a hard failure, not something to work around
When analyst fails a task:
- You must dispatch a fix subagent with specific instructions
- Don't fix it yourself in main chat—you'll pollute context and hide the real issue
Example Flow
Me: Implementing Task 1: Load and clean transaction data
[Dispatch analyst with full task text]
Analyst:
- Loaded transactions.csv: (50000, 12)
- Found 5% nulls in amount column
- "Should I drop or impute nulls?"
Me: "Impute with median, flag imputed rows"
[Re-dispatch with answer]
Analyst:
- Imputed 2,500 rows with median ($45.50)
- Added is_imputed flag column
- Final shape: (50000, 13)
- Sample output: [shows head with flag]
[Verify: shapes match, flag exists, no unexpected changes]
[Log to LEARNINGS.md]
[Mark Task 1 complete, move to Task 2]
Model Tier Hints
When dispatching subagents, match model capability to task complexity. This is advisory -- Claude Code doesn't yet support model routing -- but documents intent for cost-aware delegation.
| Task Complexity | Model Tier | Signals | Example |
|---|---|---|---|
| Mechanical | Cheapest capable | Data loading, simple filtering, descriptive stats, file format conversion | "Load CSV and compute summary statistics" |
| Integration | Standard | Merges/joins across sources, aggregations, visualization, data reshaping | "Merge transaction and customer tables, create pivot summary" |
| Architecture/Review | Most capable | Feature engineering strategy, model selection, statistical assumption validation, methodology review | "Select appropriate model family and validate distributional assumptions" |
Complexity signals:
- Reads/writes 1 file with clear spec -> mechanical
- Joins/reshapes across sources or produces visualizations -> integration
- Requires statistical judgment or methodology design -> architecture
When in doubt, use the standard tier. Over-allocating is wasteful; under-allocating produces poor results.
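The complexity signals map onto tiers in a fixed priority order, which a short function makes explicit. This is a sketch only: `model_tier` and its three boolean parameters are hypothetical, and since the hints are advisory the result documents intent and does not route anything.

```python
def model_tier(single_file_clear_spec, joins_or_visualizes, needs_statistical_judgment):
    """Map the complexity signals above to an advisory model tier (hypothetical helper)."""
    # Statistical judgment or methodology design outranks everything else.
    if needs_statistical_judgment:
        return "most capable"
    # Joins/reshapes across sources or visualizations are integration work.
    if joins_or_visualizes:
        return "standard"
    # One file with a clear spec is mechanical.
    if single_file_clear_spec:
        return "cheapest capable"
    # When in doubt, use the standard tier.
    return "standard"
```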
Integration
This skill is invoked by ds-implement during the output-first implementation phase.
After all tasks complete, ds-implement proceeds to ds-review.