nw-spike
Runs a timeboxed spike to validate one core assumption before DESIGN. Use after DISCUSS when the feature involves a new mechanism, performance requirement, or external integration.
Install this agent skill to your Project
npx add-skill https://github.com/nWave-ai/nWave/tree/main/nWave/skills/nw-spike
NW-SPIKE: Timeboxed Assumption Validation
Wave: SPIKE (between DISCUSS and DESIGN) | Agent: Attila (nw-software-crafter) | Command: /nw-spike
Overview
Execute a timeboxed spike (max 1 hour) to validate a single core assumption before investing in architecture design. Produces throwaway code and permanent findings. The spike answers: does the mechanism work, does it meet the performance budget, and what did we assume wrong?
Skip Check
Before running, verify the spike is needed. If ALL answers are "no", skip and proceed to DESIGN:
- Is there a new mechanism never tried before in this codebase?
- Is there a performance requirement that cannot be validated by reasoning alone?
- Is there an external integration with unknown behavior?
If skipping: tell the user and recommend /nw-design directly.
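The skip check above can be sketched as a small helper. This is an illustrative sketch only: the function name `next_command` and its three boolean parameters are hypothetical, and only the two commands it returns come from this skill.

```python
def next_command(new_mechanism, unverifiable_performance, unknown_integration):
    """Return the command to recommend after the skip check.

    Each argument is the yes/no answer to one skip-check question.
    The spike is skipped only when ALL three answers are "no".
    """
    if new_mechanism or unverifiable_performance or unknown_integration:
        return "/nw-spike"
    return "/nw-design"


if __name__ == "__main__":
    # All answers "no": skip the spike and go straight to DESIGN.
    print(next_command(False, False, False))
    # A single "yes" is enough to justify running the spike.
    print(next_command(False, True, False))
```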
Prior Wave Consultation
- DISCUSS artifacts: Read docs/feature/{feature-id}/discuss/ (required)
  - user-stories.md -- scope and acceptance criteria
  - wave-decisions.md -- constraints and assumptions to test
- DIVERGE artifacts: Read docs/feature/{feature-id}/diverge/recommendation.md (if present)
Interactive Decision Points
Decision 1: Spike Scope
Question: What is the ONE assumption you need to validate? Examples:
- "Can we parse pytest output reliably in <5 seconds?"
- "Can the CEL library evaluate 100 expressions in <1 second?"
- "Can we write to .git/hooks/ from a subprocess without corruption?"
Decision 2: Performance Budget
Question: What is the timing constraint? (Enter "none" if mechanism validation only) Examples:
- "<5 seconds end-to-end"
- "<100ms per operation"
- "Handle 10K items without OOM"
Agent Invocation
@nw-software-crafter
SKILL_LOADING: Before starting, load your spike methodology skill at ~/.claude/skills/nw-spike-methodology/SKILL.md using the Read tool.
Execute spike for "{feature-description}".
Spike question: {Decision 1 answer}
Performance budget: {Decision 2 answer}
Rules:
- Code goes in /tmp/spike_{feature_id}/. Never in src/.
- Max 1 hour. No tests, no types, no error handling, no abstractions.
- One file preferred. Two files maximum.
- Use time.perf_counter() for timing.
- Print results to stdout.
After spike completes:
- Write findings to docs/feature/{feature-id}/spike/findings.md
- Delete the spike code from /tmp/
- Report the binary verdict: WORKS or DOESN'T WORK
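A spike script that follows the rules above might look roughly like the sketch below. It is a minimal illustration, not the methodology's prescribed template: `BUDGET_SECONDS`, `run_mechanism`, and `measure` are hypothetical names, and the workload is a placeholder for the one mechanism under test.

```python
import time

BUDGET_SECONDS = 1.0  # hypothetical value; stands in for the Decision 2 answer


def run_mechanism():
    # Placeholder workload: replace with the single mechanism being validated.
    return sum(i * i for i in range(100_000))


def measure(fn, budget_seconds):
    # time.perf_counter() is a monotonic, high-resolution clock for intervals.
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    verdict = "WORKS" if elapsed < budget_seconds else "DOESN'T WORK"
    return elapsed, verdict


if __name__ == "__main__":
    elapsed, verdict = measure(run_mechanism, BUDGET_SECONDS)
    # Results go to stdout, as the rules require.
    print(f"elapsed: {elapsed:.3f}s (budget: {BUDGET_SECONDS}s)")
    print(f"verdict: {verdict}")
```

Saved as a single file under /tmp/spike_{feature_id}/, a script like this stays inside the one-file preference and keeps the timing and verdict easy to copy into findings.md.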
Progress Tracking
The invoked agent MUST create a task list from its workflow phases at the start of execution using TaskCreate. Each phase becomes a task with the gate condition as completion criterion. Mark tasks in_progress when starting each phase and completed when the gate passes.
Success Criteria
- Exactly one assumption tested (not two, not zero)
- Spike code lives in /tmp/, never in src/
- Completed within 1 hour (or escalated with "BIGGER THAN EXPECTED")
- findings.md written with binary verdict, timing, and edge cases
- Spike code deleted after findings written
- Design implications documented (what was assumed wrong)
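The file-related criteria above lend themselves to a mechanical check. The sketch below covers only what can be verified on disk (findings written, verdict present, spike code deleted); the function name `unmet_criteria` and the message strings are hypothetical, and the demo paths are throwaway temporary directories.

```python
import tempfile
from pathlib import Path


def unmet_criteria(feature_dir, spike_dir):
    """Return a list of file-level success criteria that are not yet met."""
    problems = []
    findings = Path(feature_dir) / "spike" / "findings.md"
    if not findings.is_file():
        problems.append("findings.md not written")
    elif not any(v in findings.read_text() for v in ("WORKS", "DOESN'T WORK")):
        problems.append("findings.md has no binary verdict")
    if Path(spike_dir).exists():
        problems.append("spike code not deleted")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        feature = Path(root) / "feature"
        spike = Path(root) / "spike_demo"
        spike.mkdir()
        # Spike code still present and no findings yet: two criteria unmet.
        print(unmet_criteria(feature, spike))
        (feature / "spike").mkdir(parents=True)
        (feature / "spike" / "findings.md").write_text("Verdict: WORKS\n")
        spike.rmdir()
        # Findings written and spike code removed: nothing left to report.
        print(unmet_criteria(feature, spike))
```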
Next Wave
Handoff To: nw-solution-architect (DESIGN wave)
Deliverables: docs/feature/{feature-id}/spike/findings.md
Design reads findings before starting -- spike results override any prior assumptions.
Wave Decisions Summary
Before completing SPIKE, produce docs/feature/{feature-id}/spike/wave-decisions.md:
# SPIKE Decisions -- {feature-id}
## Assumption Tested
- {the one question}
## Verdict
- {WORKS / DOESN'T WORK}: {one-line summary}
## Design Implications
- {what DESIGN must account for based on spike results}
## Constraints Discovered
- {any new constraints from edge cases}
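As an illustration, the template above could be filled in programmatically. This is a sketch under assumptions: `render_wave_decisions` and `VERDICTS` are hypothetical names, and the sample values are taken from the performance-spike example in this document.

```python
VERDICTS = ("WORKS", "DOESN'T WORK")


def render_wave_decisions(feature_id, assumption, verdict, summary,
                          implications, constraints):
    """Fill the SPIKE wave-decisions template and return it as text."""
    # The verdict is binary, so anything else is rejected up front.
    if verdict not in VERDICTS:
        raise ValueError(f"verdict must be one of {VERDICTS}")
    lines = [
        f"# SPIKE Decisions -- {feature_id}",
        "## Assumption Tested",
        f"- {assumption}",
        "## Verdict",
        f"- {verdict}: {summary}",
        "## Design Implications",
        *[f"- {item}" for item in implications],
        "## Constraints Discovered",
        *[f"- {item}" for item in constraints],
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_wave_decisions(
        feature_id="wave-matrix",
        assumption="Can we collect pytest markers + parse filesystem state in <5 seconds?",
        verdict="DOESN'T WORK",
        summary="pytest collection takes 44 seconds (budget blown)",
        implications=["cache-first architecture"],
        constraints=["use cache + collect-only"],
    ))
```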
Examples
Example 1: Performance spike
/nw-spike "wave-matrix -- derive feature status from pytest + filesystem"
Spike question: "Can we collect pytest markers + parse filesystem state in <5 seconds?"
Agent writes 50-line script in /tmp/spike_wave_matrix/, discovers pytest collection takes 44 seconds (budget blown). Findings document the correct approach (cache + collect-only). Code deleted. DESIGN proceeds with cache-first architecture.
Example 2: Integration spike
/nw-spike "cel-policy-engine -- evaluate access control expressions"
Spike question: "Can cel-python evaluate 100 policy expressions in <1 second?"
Agent installs cel-python, writes evaluation loop, measures 23ms for 100 expressions. Verdict: WORKS. Edge case: nested map access syntax differs from Go CEL. Findings inform DESIGN's expression schema.
Example 3: Mechanism spike
/nw-spike "git-hook-wiring -- install hooks via subprocess"
Spike question: "Can we write to .git/hooks/ from a Python subprocess without file corruption?"
Agent writes hook installer, tests with concurrent access. Verdict: WORKS but needs file locking. Edge case: Windows line endings corrupt hook on WSL. Findings feed into DESIGN's cross-platform strategy.
Expected Outputs
docs/feature/{feature-id}/spike/
findings.md
wave-decisions.md