Agent skill
ds-review
This skill should be used when running Phase 4 of the /ds workflow to review methodology, data quality, and statistical validity. Provides structured review checklists, confidence scoring, and issue identification for data analysis validation.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/ds-review
SKILL.md
Announce: "Using ds-review (Phase 4) to check methodology and quality."
Contents
- The Iron Law of DS Review
- Red Flags - STOP Immediately If You Think
- Review Focus Areas
- Confidence Scoring
- Common DS Issues to Check
- Required Output Structure
- Agent Invocation
- Quality Standards
Analysis Review
Single-pass review combining methodology correctness, data quality handling, and reproducibility checks. Uses confidence-based filtering.
You MUST only report issues with >= 80% confidence. This is not negotiable.
Before reporting ANY issue, you MUST:
- Verify it's not a false positive
- Verify it impacts results or reproducibility
- Assign a confidence score
- Only report if score >= 80
This applies even when:
- "This methodology looks suspicious"
- "I think this might introduce bias"
- "The approach seems unusual"
- "I would have done it differently"
STOP - If you catch yourself about to report a low-confidence issue, DISCARD IT. You're about to compromise the review's integrity. </EXTREMELY-IMPORTANT>
Red Flags - STOP Immediately If You Think:
| Thought | Why It's Wrong | Do Instead |
|---|---|---|
| "This looks wrong" | Your vague suspicion isn't evidence | Find concrete proof or discard |
| "I would do it differently" | Your style preference isn't a methodology error | Check if the approach is valid |
| "This might cause problems" | Your "might" means < 80% confidence | Find proof or discard |
| "Unusual approach" | Unusual isn't wrong—your bias toward familiar methods is clouding judgment | Verify the methodology is sound |
Review Focus Areas
Spec Compliance
- Verify all objectives from .claude/SPEC.md are addressed
- Confirm success criteria can be verified
- Check constraints were respected (especially replication requirements)
- Verify analysis answers the original question
Data Quality Handling
- Confirm missing values handled appropriately (not ignored)
- Verify duplicates addressed (documented if kept)
- Check outliers considered (handled or justified)
- Verify data types correct (dates parsed, numerics not strings)
- Confirm filtering logic documented with counts
Methodology Appropriateness
- Verify statistical methods appropriate for data type
- Check assumptions documented and verified (normality, independence, etc.)
- Confirm sample sizes adequate for conclusions
- Check multiple comparisons addressed if applicable
- Verify causality claims justified (or appropriately limited)
Reproducibility
- Verify random seeds set where needed
- Check package versions documented
- Verify data source/version documented
- Confirm all transformations traceable
- Verify results can be regenerated
Output Quality
- Verify visualizations labeled (title, axes, legend)
- Check numbers formatted appropriately (sig figs, units)
- Verify conclusions supported by evidence shown
- Confirm limitations acknowledged
Confidence Scoring
Rate each potential issue from 0-100:
| Score | Meaning |
|---|---|
| 0 | False positive or style preference |
| 25 | Might be real, methodology is unusual but valid |
| 50 | Real issue but minor impact on conclusions |
| 75 | Verified issue, impacts result interpretation |
| 100 | Certain error that invalidates conclusions |
CRITICAL: You MUST only report issues with confidence >= 80. If you report below this threshold, you're misrepresenting your certainty.
Common DS Issues to Check
Data Leakage
- Training data contains information from future
- Test data used in feature engineering
- Target variable used directly or indirectly in features
Selection Bias
- Filtering introduced systematic bias
- Survivorship bias in longitudinal data
- Non-random sampling not addressed
Statistical Errors
- Multiple testing without correction
- p-hacking or selective reporting
- Correlation interpreted as causation
- Inadequate sample size for claimed precision
Reproducibility Failures
- Random operations without seeds
- Undocumented data preprocessing
- Hard-coded paths or environment dependencies
- Missing package versions
Required Output Structure
markdown-output-structure: Template for review results with confidence-scored issues
## Analysis Review: [Analysis Name]
Reviewing: [files/notebooks being reviewed]
### Critical Issues (Confidence >= 90)
#### [Issue Title] (Confidence: XX)
**Location:** `file/path.py:line` or `notebook.ipynb cell N`
**Problem:** Clear description of the issue
**Impact:** How this affects results/conclusions
**Fix:**
```python
# Specific fix
Important Issues (Confidence 80-89)
[Same format as Critical Issues]
Data Quality Checklist
| Check | Status | Notes |
|---|---|---|
| Missing values | PASS/FAIL | [details] |
| Duplicates | PASS/FAIL | [details] |
| Outliers | PASS/FAIL | [details] |
| Type correctness | PASS/FAIL | [details] |
Methodology Checklist
| Check | Status | Notes |
|---|---|---|
| Appropriate for data | PASS/FAIL | [details] |
| Assumptions checked | PASS/FAIL | [details] |
| Sample size adequate | PASS/FAIL | [details] |
Reproducibility Checklist
| Check | Status | Notes |
|---|---|---|
| Seeds set | PASS/FAIL | [details] |
| Versions documented | PASS/FAIL | [details] |
| Data versioned | PASS/FAIL | [details] |
Summary
Verdict: APPROVED | CHANGES REQUIRED
[If APPROVED] The analysis meets quality standards. No methodology issues with confidence >= 80 detected.
[If CHANGES REQUIRED] X critical issues and Y important issues must be addressed before proceeding.
## Agent Invocation
task-agent-spawn: Spawn Task agent for structured analysis review
Spawn a Task agent to review the analysis:
Task(subagent_type="general-purpose"): "Review analysis against .claude/SPEC.md.
Execute single-pass review covering:
- Spec compliance - verify objectives met
- Data quality - confirm nulls, dupes, outliers handled
- Methodology - verify appropriate, assumptions checked
- Reproducibility - confirm seeds, versions, documentation
Confidence score each issue (0-100). Report only issues with >= 80 confidence. Return structured output per /ds-review format."
## Quality Standards
- **You must NOT report methodology preferences not backed by statistical principles.** Your opinion about how code should be written is not a review issue.
- **You must treat alternative valid approaches as non-issues (confidence = 0).** If the approach works correctly, don't report it.
- Ensure each reported issue is immediately actionable
- **If you're unsure, rate it below 80 confidence.** Uncertainty is not a reason to report—it's a reason to investigate more.
- Focus on what affects conclusions, not style. **STOP if you catch yourself criticizing coding style—that's not your role here.**
## Phase Complete
phase-transition: Invoke ds-verify after APPROVED review
After review is APPROVED, immediately invoke:
ds-verify: Verify analysis reproducibility and user acceptance
Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/ds-verify/SKILL.md")
If CHANGES REQUIRED, return to `/ds-implement` to fix issues first.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
agent-ops-spec
Manage specification documents in .agent/specs/. Use when user provides requirements, acceptance criteria, or feature descriptions that need to be tracked and validated against implementation.
agent-ops-state
Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.
agent-ops-spec
Manage specification documents in .agent/specs/. Use when user provides requirements, acceptance criteria, or feature descriptions that need to be tracked and validated against implementation.
agent-ops-testing
Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
agent-ops-testing
Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
agent-ops-state
Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.
Didn't find tool you were looking for?