Agent skill

test-driven

Test-Driven Development (TDD) - design tests from requirements, then execute the RED -> GREEN -> REFACTOR cycle. Use when implementing features or fixes with TDD methodology, writing tests before code, or following XP-style development across any supported language.

Install this agent skill to your Project

npx add-skill https://github.com/OutlineDriven/odin-claude-plugin/tree/main/skills/test-driven

SKILL.md

Test-driven development (XP-style)

Tests define the specification. Design them from requirements before any implementation. The RED-GREEN-REFACTOR cycle is the heartbeat: write a failing test, make it pass with minimal code, then clean up while green.

Modern insight (2025): Pairing TDD with property-based testing is the standard -- example tests prevent regressions, while property tests discover edge cases. TDD also serves AI-assisted development: structural integrity keeps code understandable for both human and AI collaborators (Kent Beck, "Augmented Coding"). Mutation testing validates test quality beyond coverage metrics (reported mutation coverage: 63.3% for TDD plus mutation testing vs 39.4% for TDD alone).
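
As a rough illustration of the example-plus-property pairing, the sketch below tests a hypothetical `dedupe` function two ways. The function and its invariants are assumptions made for illustration, and the property test is hand-rolled with the standard library's `random` module rather than a dedicated property-based testing library, which would also shrink failing inputs.

```python
import random


def dedupe(items):
    """Remove duplicates while preserving first-seen order (hypothetical example)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def test_dedupe_example():
    # Example test: pins one known input/output pair against regressions.
    assert dedupe([3, 1, 3, 2, 1]) == [3, 1, 2]


def test_dedupe_properties(trials=200, seed=0):
    # Property test: checks invariants over many randomly generated inputs.
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(-5, 5) for _ in range(rng.randint(0, 20))]
        out = dedupe(items)
        assert len(out) == len(set(out))  # no duplicates remain
        assert set(out) == set(items)     # nothing lost, nothing invented
        assert dedupe(out) == out         # idempotent
```

A runner such as pytest would collect both `test_*` functions; the fixed seed keeps the random cases reproducible.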

See frameworks for language-specific test runners, property testing, coverage, and mutation tools. See examples for brief TDD cycle patterns per language.


When to Apply

  • New features with clear requirements (both inside-out and outside-in approaches valid)
  • Bug fixes -- write a failing test that proves the bug before fixing
  • Refactoring -- ensure coverage exists before restructuring
  • API contract enforcement -- test the interface, not internals
  • Property-based invariants -- complement example tests with PBT
  • Legacy code -- add characterization tests before modifying (Michael Feathers pattern)
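
The bug-fix case can be sketched as follows. The `average` function, its crash on empty input, and the requirement that empty input yields `0.0` are all assumptions invented for this example; the point is only that the same check fails against the buggy version and passes against the fix.

```python
def average_buggy(values):
    # Hypothetical original behavior: raises ZeroDivisionError on an empty list.
    return sum(values) / len(values)


def average(values):
    # Hypothetical fix (assumed requirement): empty input yields 0.0.
    if not values:
        return 0.0
    return sum(values) / len(values)


def check_empty_average(fn):
    """The bug-proving check, parameterized so it can run against both versions."""
    try:
        return fn([]) == 0.0
    except ZeroDivisionError:
        return False


# RED: the check fails against the buggy code, which proves the bug exists.
assert check_empty_average(average_buggy) is False
# GREEN: the same check passes once the fix is in place.
assert check_empty_average(average) is True
```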

When NOT to Apply

  • Exploratory prototyping or spike research
  • One-off scripts, data migrations, generated code
  • Purely visual UI layout work (prefer visual regression testing)
  • Highly experimental algorithmic research (but PBT still helps)
  • Throwaway code with a lifespan under one week

Anti-patterns

  • Test-last: Writing tests after implementation defeats the design benefit
  • Testing implementation details: Tests should verify behavior, not internal structure -- breaks refactoring confidence
  • Over-mocking: Testing the mocks instead of the code; mock external I/O, not core logic
  • Skipping RED: Tests that never fail aren't tests -- they verify nothing
  • 100% coverage obsession: Coverage does not equal quality. Mutation testing exposes gaps that coverage cannot.
  • Refactoring on RED: Never restructure with failing tests
  • Test-induced architectural damage: Letting mock boundaries dictate design
  • Snapshot bloat: Approval-style tests without curation become a maintenance burden
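
To make the "testing implementation details" anti-pattern concrete, here is a minimal sketch using a hypothetical `Cart` class (not part of this skill). The test goes through the public API only, so the internal storage can be restructured without breaking it.

```python
class Cart:
    """Hypothetical shopping cart used only to illustrate the anti-pattern."""

    def __init__(self):
        self._lines = []  # internal storage; free to change during refactoring

    def add(self, name, unit_price, quantity=1):
        self._lines.append((name, unit_price, quantity))

    def total(self):
        return sum(price * qty for _, price, qty in self._lines)


def test_total_reflects_added_items():
    # Behavior-focused: exercises add() and total(), never touches _lines.
    cart = Cart()
    cart.add("pen", 2, quantity=3)
    cart.add("pad", 5)
    assert cart.total() == 11


# Brittle alternative, shown for contrast only:
#   assert cart._lines == [("pen", 2, 3), ("pad", 5, 1)]
# It would break if _lines became a dict, even though behavior is unchanged.
```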

Two Schools (decision guidance, not prescription)

  • Inside-Out (Classic/Detroit): Start with unit tests for the smallest pieces, build upward. Minimizes mocks. Best for well-understood domains, algorithms, utility functions.
  • Outside-In (London/Mockist): Start with an acceptance test for user-facing behavior, use mocks to discover interfaces. Best for layered systems, APIs, microservices.
  • Pragmatic teams use both depending on context. Neither is superior.

Test Doubles Hierarchy

  • Stubs: Return predefined data; verify outcomes (state-based)
  • Mocks: Verify interactions/calls were made (behavior-based)
  • Fakes: Working implementations (e.g., in-memory database)
  • Spies: Record calls while using real behavior
  • Rule: Mock external dependencies. Never mock core domain logic.
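
The four kinds of double can be sketched with Python's standard `unittest.mock`. Everything domain-specific here (`notify_if_overdrawn`, the balance source, the mailer) is a hypothetical example; only `Mock`, `wraps`, and the `assert_*` methods come from the standard library.

```python
from unittest.mock import Mock


def notify_if_overdrawn(account_id, balances, mailer):
    """Hypothetical core logic: warn when the balance is negative."""
    balance = balances.get_balance(account_id)
    if balance < 0:
        mailer.send(account_id, f"Balance is {balance}")
        return True
    return False


class StubBalances:
    # Stub: returns predefined data, nothing more.
    def __init__(self, balance):
        self._balance = balance

    def get_balance(self, account_id):
        return self._balance


class FakeBalances:
    # Fake: a small working in-memory implementation.
    def __init__(self):
        self._data = {}

    def deposit(self, account_id, amount):
        self._data[account_id] = self._data.get(account_id, 0) + amount

    def get_balance(self, account_id):
        return self._data.get(account_id, 0)


class InMemoryMailer:
    # Real (if trivial) behavior for the spy to wrap.
    def __init__(self):
        self.outbox = []

    def send(self, account_id, message):
        self.outbox.append((account_id, message))


def test_with_stub_checks_outcome():
    assert notify_if_overdrawn("a1", StubBalances(-10), Mock()) is True


def test_with_mock_checks_interaction():
    mailer = Mock()
    notify_if_overdrawn("a1", StubBalances(-10), mailer)
    mailer.send.assert_called_once_with("a1", "Balance is -10")


def test_with_fake_uses_working_implementation():
    balances = FakeBalances()
    balances.deposit("a1", 25)
    mailer = Mock()
    assert notify_if_overdrawn("a1", balances, mailer) is False
    mailer.send.assert_not_called()


def test_with_spy_records_calls_and_keeps_real_behavior():
    real = InMemoryMailer()
    spy = Mock(wraps=real)  # calls are recorded and forwarded to `real`
    notify_if_overdrawn("a1", StubBalances(-1), spy)
    spy.send.assert_called_once()
    assert real.outbox == [("a1", "Balance is -1")]
```

In every test the doubles replace the external collaborators (balance store, mailer), while `notify_if_overdrawn` itself runs unmocked, in line with the rule above.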

Workflow (language-neutral)

  1. CREATE -- Write failing tests: error cases -> edge cases -> happy paths -> property tests
  2. RED -- Run tests, verify all fail. If any pass, the test is wrong or the behavior already exists.
  3. GREEN -- Minimal code to pass. No extras, no optimization, no cleanup.
  4. REFACTOR -- Clean up while green. Separate structural changes from behavioral (Tidy First). Re-run tests after every change.
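
One pass through the four steps might look like the sketch below. The `slugify` function and its requirements are assumptions chosen for illustration; the tests are ordered error case, edge case, happy path, and the implementation shown is the state after GREEN and a small REFACTOR.

```python
import re


# CREATE / RED: these tests are written first and fail while slugify is missing.
def test_rejects_non_string():
    try:
        slugify(None)
    except TypeError:
        return
    raise AssertionError("expected TypeError for non-string input")


def test_empty_string_gives_empty_slug():
    assert slugify("") == ""


def test_words_joined_with_hyphens():
    assert slugify("Hello, World!") == "hello-world"


# GREEN: the first passing version can be as crude as needed.
# REFACTOR: with tests green, the body was tidied into the regex form below,
# re-running the tests after each change.
def slugify(text):
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```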

Constitutional Rules (Non-Negotiable)

  1. Design Tests First: Plan all test cases from requirements before implementation; write each test iteratively in the RED-GREEN-REFACTOR loop
  2. RED Before GREEN: Each new test MUST fail before you write implementation for it
  3. Error Cases First: Implement error handling before success paths
  4. One Test at a Time: Write one failing test, make it pass, refactor, then add the next test
  5. Refactor Only on GREEN: Never refactor with failing tests

Validation Gates

| Gate | Pass Criteria | Blocking |
|---|---|---|
| Tests Created | Test files exist for target module | Yes |
| RED State | All new tests fail before implementation | Yes |
| GREEN State | All tests pass after implementation | Yes |
| Coverage | >= 80% line coverage | No |
| Mutation | Mutation score reviewed (no threshold enforced) | No |
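
The distinction between blocking and non-blocking gates can be sketched as a small evaluation step. The names `GateResult`, `evaluate_gates`, and `may_proceed`, and the choice of boolean and float inputs, are hypothetical; only the gate names, criteria, and blocking flags come from the gates listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    name: str
    passed: bool
    blocking: bool


def evaluate_gates(tests_exist, red_all_failed, green_all_passed,
                   line_coverage, mutation_reviewed):
    """Build one result per gate; line_coverage is a fraction in [0, 1]."""
    return [
        GateResult("Tests Created", tests_exist, True),
        GateResult("RED State", red_all_failed, True),
        GateResult("GREEN State", green_all_passed, True),
        GateResult("Coverage", line_coverage >= 0.80, False),
        GateResult("Mutation", mutation_reviewed, False),
    ]


def may_proceed(results):
    # Only blocking gates can stop the cycle; the others are advisory.
    return all(r.passed for r in results if r.blocking)
```

Under these assumptions, a run with 50% coverage and no mutation review would still be allowed to proceed, while a run whose new tests passed before implementation would not.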

Exit Codes

| Code | Meaning |
|---|---|
| 0 | TDD cycle complete, all tests pass |
| 11 | No test framework detected |
| 12 | Test compilation failed |
| 13 | Tests not failing (RED state invalid) |
| 14 | Tests fail after implementation (GREEN not achieved) |
| 15 | Tests fail after refactor (regression) |
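
A wrapper script could map the outcome of a cycle onto these codes roughly as follows. The enum member names, the function `exit_code`, and the order in which conditions are checked are assumptions; only the numeric codes and their meanings are taken from the list above.

```python
from enum import IntEnum


class TddExit(IntEnum):
    # Member names are hypothetical; values match the documented codes.
    OK = 0
    NO_FRAMEWORK = 11
    COMPILE_FAILED = 12
    RED_INVALID = 13
    GREEN_NOT_ACHIEVED = 14
    REFACTOR_REGRESSION = 15


def exit_code(framework_found, compiled, red_failed,
              green_passed, refactor_passed):
    """Return the first failing stage's code, checked in workflow order (assumed)."""
    if not framework_found:
        return TddExit.NO_FRAMEWORK
    if not compiled:
        return TddExit.COMPILE_FAILED
    if not red_failed:
        return TddExit.RED_INVALID
    if not green_passed:
        return TddExit.GREEN_NOT_ACHIEVED
    if not refactor_passed:
        return TddExit.REFACTOR_REGRESSION
    return TddExit.OK
```

Because `IntEnum` members are integers, the result can be passed directly to `sys.exit`.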
