Agent skill

hypothesis-debugging

Structured code debugging through hypothesis formation and falsification planning. Use when diagnosing bugs, unexpected behaviour, or system failures where the root cause is unclear. Produces a hypothesis document for execution by another agent rather than performing the investigation directly. Triggers on requests to debug issues, diagnose problems, investigate failures, or create debugging plans.

Install this agent skill in your project

npx add-skill https://github.com/leynos/agent-helper-scripts/tree/main/skills/hypothesis-debugging

SKILL.md

Hypothesis-Driven Debugging

Generate a structured debugging document that identifies candidate root causes and provides falsification plans for each. The output document instructs a separate execution agent; do not perform the investigation yourself.

Philosophical Foundation

Apply Popperian falsificationism: hypotheses cannot be proven true, only disproven. Design tests that could definitively rule out each hypothesis rather than confirm it. A good falsification test produces a clear negative result if the hypothesis is wrong.
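
As a rough illustration of that asymmetry, the sketch below allows only two verdicts, falsified or surviving, and never "proven". The `Verdict` enum, the `judge` function, and the stale-cache scenario are hypothetical names invented for this example; they are not part of the skill.

```python
from enum import Enum


class Verdict(Enum):
    FALSIFIED = "falsified"
    SURVIVES = "survives"  # deliberately not "proven"


def judge(predicted: str, observed: str) -> Verdict:
    """A hypothesis survives only while the observation matches its prediction."""
    return Verdict.SURVIVES if observed == predicted else Verdict.FALSIFIED


# Hypothetical scenario: "a stale cache entry causes the wrong total".
# Prediction: bypassing the cache makes the total correct.
prediction = "total correct with cache bypassed"
observation = "total wrong with cache bypassed"
verdict = judge(prediction, observation)
```

Here the observation contradicts the prediction, so the hypothesis is ruled out. A matching observation would only mean the hypothesis has not yet been eliminated.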

Process

1. Gather Context

Before forming hypotheses, collect:

Symptom description: What behaviour is observed vs expected?
Reproduction conditions: When does it occur? Intermittent or consistent?
Recent changes: Deployments, configuration changes, dependency updates
Error artefacts: Stack traces, logs, error messages, screenshots
Environmental factors: OS, runtime versions, network conditions

If information is missing, note gaps in the output document.
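
One possible way to track the five context items and surface the gaps is sketched below. The `DebugContext` class and its field names are assumptions made for illustration; the skill does not prescribe a data structure.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DebugContext:
    # One optional field per context item listed above (names are hypothetical).
    symptom: Optional[str] = None
    reproduction: Optional[str] = None
    recent_changes: Optional[str] = None
    error_artefacts: Optional[str] = None
    environment: Optional[str] = None

    def gaps(self) -> list[str]:
        """Names of the items that are still missing or empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


context = DebugContext(
    symptom="Checkout total is 0 instead of the basket sum",
    environment="Linux, Python 3.12",
)
missing = context.gaps()
```

The `missing` list can then be copied into the output document so the executing agent knows which information was unavailable.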

2. Form Hypotheses

Generate 1–5 hypotheses ranked by plausibility. Each hypothesis must be:

Specific: Name the component, function, or interaction suspected
Falsifiable: A concrete test could disprove it
Independent: Falsifying one should not automatically falsify others

Common hypothesis categories:

Category	Examples
State	Race condition, stale cache, corrupted data
Input	Malformed payload, encoding issue, boundary case
Environment	Missing dependency, version mismatch, resource exhaustion
Logic	Off-by-one, incorrect predicate, missing null check
Integration	API contract violation, timeout, auth failure

Avoid vague hypotheses ("something wrong with the database"). Pin down the specific failure mode.
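
The constraints in this step could be checked mechanically along the following lines. This is a minimal sketch: the `Hypothesis` class, the `rank` function, the numeric plausibility score, and the example components are all hypothetical, and a non-empty `component` field is only a crude stand-in for real specificity.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    statement: str
    component: str        # the specific component, function, or interaction
    category: str         # e.g. State, Input, Environment, Logic, Integration
    plausibility: float   # subjective score in [0, 1]; an assumption of this sketch


def rank(hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    """Validate the count and specificity, then order by plausibility, highest first."""
    if not 1 <= len(hypotheses) <= 5:
        raise ValueError("expected between 1 and 5 hypotheses")
    for h in hypotheses:
        if not h.component.strip():
            raise ValueError(f"hypothesis names no component: {h.statement!r}")
    return sorted(hypotheses, key=lambda h: h.plausibility, reverse=True)


candidates = [
    Hypothesis("Cache returns an entry after its source row changed",
               "PriceCache.get", "State", 0.3),
    Hypothesis("Loop skips the final basket item",
               "sum_basket()", "Logic", 0.6),
]
ranked = rank(candidates)
```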

3. Design Falsification Plans

For each hypothesis, specify:

Prediction: If this hypothesis is correct, what observable outcome follows?
Falsification test: What action would produce a contradicting observation?
Expected negative result: What outcome would disprove the hypothesis?
Tooling required: Commands, scripts, or instrumentation needed
Confidence impact: How decisively would a negative result rule this out?

Prefer tests that are:

Quick to execute
Minimally invasive
Deterministic rather than probabilistic
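
The five fields of a falsification plan might be captured as a small record, as in the sketch below. The `FalsificationPlan` class, its `render` method, and the dependency-version scenario are invented for illustration; the actual layout is defined by the skill's template, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class FalsificationPlan:
    prediction: str
    falsification_test: str
    expected_negative_result: str
    tooling: str
    confidence_impact: str

    def render(self) -> str:
        """Render the plan as labelled lines, one per field."""
        return "\n".join([
            f"Prediction: {self.prediction}",
            f"Falsification test: {self.falsification_test}",
            f"Expected negative result: {self.expected_negative_result}",
            f"Tooling required: {self.tooling}",
            f"Confidence impact: {self.confidence_impact}",
        ])


# Hypothetical hypothesis: "the failure comes from a dependency upgrade".
plan = FalsificationPlan(
    prediction="Reverting the dependency to its previous version removes the failure",
    falsification_test="Run the failing test with the previous version installed",
    expected_negative_result="The test still fails with the previous version",
    tooling="The project's package manager and test runner",
    confidence_impact="High: the upgrade cannot be the cause if the failure persists",
)
rendered = plan.render()
```

This example test is quick, reversible, and deterministic, which matches the preferences listed above.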

4. Output Document

Generate a Markdown document following the template in assets/debugging-plan.md. Save to the working directory as debugging-plan-{timestamp}.md.
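
The file name pattern could be produced as sketched below. The source does not specify a timestamp format, so the compact UTC form used here is an assumption, and `plan_path` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from pathlib import Path


def plan_path(directory: Path, now: datetime) -> Path:
    """Build debugging-plan-{timestamp}.md inside the given directory."""
    # Assumed format: compact UTC timestamp, e.g. 20240102T030405Z.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return directory / f"debugging-plan-{stamp}.md"


example = plan_path(Path("."), datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
```

In practice the caller would pass `Path.cwd()` and `datetime.now(timezone.utc)`, then write the filled-in template to the returned path.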

Quality Criteria

A well-formed debugging plan exhibits:

Mutual exclusivity: Hypotheses do not overlap, so falsifying one leaves the others standing
Collective exhaustiveness: Hypotheses cover the likely failure space
Ordered efficiency: Cheapest decisive tests appear first
Clear success criteria: The executing agent knows when to stop
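
The "ordered efficiency" criterion can be read as a sort: decisive tests first, cheapest first within each group. The sketch below assumes each test carries an estimated cost in minutes and a yes/no decisiveness flag; both fields, and the names `PlannedTest` and `execution_order`, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlannedTest:
    name: str
    cost_minutes: float   # estimated effort; an assumption of this sketch
    decisive: bool        # would a negative result rule the hypothesis out?


def execution_order(tests: list[PlannedTest]) -> list[PlannedTest]:
    """Decisive tests first, then by ascending cost within each group."""
    return sorted(tests, key=lambda t: (not t.decisive, t.cost_minutes))


planned = [
    PlannedTest("bisect last 40 commits", 60, True),
    PlannedTest("skim recent logs", 5, False),
    PlannedTest("rerun with cache disabled", 2, True),
]
ordered = [t.name for t in execution_order(planned)]
```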

Anti-Patterns

Confirmation bias: Designing tests that can only succeed, not fail
Hypothesis creep: Adding new hypotheses ad hoc during execution instead of revising the plan
Coupling: Tests that cannot isolate individual hypotheses
Vagueness: "Check the logs" without specifying what pattern would falsify
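
To contrast with the vagueness anti-pattern, a log check can name the exact pattern whose absence would falsify the hypothesis. The sketch below is illustrative only: the log lines, the pattern, and the function name `survives_log_check` are invented.

```python
import re
from typing import Iterable


def survives_log_check(lines: Iterable[str], predicted_pattern: str) -> bool:
    """Return True if at least one line matches the pattern the hypothesis predicts."""
    regex = re.compile(predicted_pattern)
    return any(regex.search(line) for line in lines)


# Invented log lines for illustration only.
sample_log = [
    "2024-01-01T00:00:01Z INFO request started id=17",
    "2024-01-01T00:00:31Z ERROR request timed out id=17",
]

# Hypothesis: the timeout is caused by connection pool exhaustion.
# Prediction: a pool-exhaustion message appears in the log.
pool_hypothesis_survives = survives_log_check(sample_log, r"pool exhausted")
```

The absence of the pattern counts against the hypothesis only if the relevant logging is known to be enabled, so the plan should state that precondition as well.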

References

references/examples.md: Worked examples of hypothesis-falsification pairs across common debugging scenarios (API timeouts, flaky tests, memory leaks)

Maintainer

leynos (core maintainer)

Source details

Full Name: leynos/agent-helper-scripts
Branch: main
Path in repo: skills/hypothesis-debugging
