Agent skill

experiment-designer

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.


Install this agent skill to your Project

npx add-skill https://github.com/alirezarezvani/claude-skills/tree/main/product-team/experiment-designer

SKILL.md

Experiment Designer

Design, prioritize, and evaluate product experiments with clear hypotheses and defensible decisions.

When To Use

Use this skill for:

  • A/B and multivariate experiment planning
  • Hypothesis writing and success criteria definition
  • Sample size and minimum detectable effect planning
  • Experiment prioritization with ICE scoring
  • Reading statistical output for product decisions

Core Workflow

  1. Write hypothesis in If/Then/Because format
  • If we change [intervention]
  • Then [metric] will change by [expected direction/magnitude]
  • Because [behavioral mechanism]
  2. Define metrics before running the test
  • Primary metric: single decision metric
  • Guardrail metrics: quality/risk protection
  • Secondary metrics: diagnostics only
  3. Estimate sample size
  • Baseline conversion or baseline mean
  • Minimum detectable effect (MDE)
  • Significance level (alpha) and power

Use:

```bash
python3 scripts/sample_size_calculator.py --baseline-rate 0.12 --mde 0.02 --mde-type absolute
```
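
To show how baseline rate, MDE, alpha, and power combine, here is a minimal sketch of a textbook normal-approximation formula for comparing two proportions. It is an illustration only and not the implementation of `scripts/sample_size_calculator.py`; the function name, the two-sided test, and the defaults (alpha 0.05, power 0.8) are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, mde, mde_type="absolute",
                            alpha=0.05, power=0.8):
    """Approximate users needed per variant for a two-sided test of two proportions.

    Hypothetical helper: uses the unpooled normal approximation
    n = (z_alpha + z_beta)^2 * (p1*q1 + p2*q2) / delta^2.
    """
    # Convert a relative MDE (e.g. 0.10 = +10% of baseline) to an absolute difference.
    delta = baseline_rate * mde if mde_type == "relative" else mde
    p1 = baseline_rate
    p2 = baseline_rate + delta
    if delta == 0 or not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must lie strictly between 0 and 1 and MDE must be non-zero")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


if __name__ == "__main__":
    n = sample_size_per_variant(0.12, 0.02, mde_type="absolute")
    print(f"per variant: {n}, total for two variants: {2 * n}")
```

Because the required sample grows with the inverse square of the MDE, halving the MDE roughly quadruples the sample. This is why the MDE should be set from the smallest effect worth acting on rather than from the effect you hope to see.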
  4. Prioritize experiments with ICE
  • Impact: potential upside
  • Confidence: evidence quality
  • Ease: cost/speed/complexity

ICE Score = (Impact * Confidence * Ease) / 10
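
The formula above can be applied to a backlog with a few lines of Python. This is a sketch under stated assumptions: the source does not fix a rating scale, so a 1 to 10 scale per factor is assumed here, and the class name, function name, and example ideas are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExperimentIdea:
    name: str
    impact: int      # potential upside (assumed 1-10 scale)
    confidence: int  # evidence quality (assumed 1-10 scale)
    ease: int        # cost/speed/complexity, higher is easier (assumed 1-10 scale)

    def ice_score(self) -> float:
        # ICE Score = (Impact * Confidence * Ease) / 10, as defined in this skill.
        return (self.impact * self.confidence * self.ease) / 10


def rank_by_ice(ideas):
    """Return ideas sorted from highest to lowest ICE score."""
    return sorted(ideas, key=lambda idea: idea.ice_score(), reverse=True)


if __name__ == "__main__":
    backlog = [
        ExperimentIdea("Shorter signup form", impact=8, confidence=5, ease=6),
        ExperimentIdea("New pricing page copy", impact=6, confidence=7, ease=9),
    ]
    for idea in rank_by_ice(backlog):
        print(f"{idea.name}: {idea.ice_score():.1f}")
```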

  5. Launch with stopping rules
  • Decide fixed sample size or fixed duration in advance
  • Avoid repeated peeking unless you use a method designed for it (for example, sequential testing)
  • Monitor guardrails continuously
  6. Interpret results
  • Statistical significance is not business significance
  • Compare point estimate + confidence interval to decision threshold
  • Investigate novelty effects and segment heterogeneity
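
For a conversion-rate metric, the point estimate and confidence interval in the last step could be computed as follows. This is a minimal sketch using the Wald (normal-approximation) interval for a difference of two proportions, which is one common choice and not necessarily what this skill's references prescribe. The function name and the counts in the usage example are hypothetical.

```python
import math
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Return (point_estimate, ci_low, ci_high) for rate_b - rate_a.

    Wald interval; assumes samples are large enough for the normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a
    # Unpooled standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return diff, diff - z * se, diff + z * se


if __name__ == "__main__":
    # Hypothetical counts: control converts 1200/10000, variant 1350/10000.
    diff, low, high = diff_confidence_interval(1200, 10_000, 1350, 10_000)
    print(f"lift: {diff:.4f}, 95% CI: [{low:.4f}, {high:.4f}]")
```

Reading the whole interval against the decision threshold, rather than only checking whether it excludes zero, keeps the decision tied to business impact.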

Hypothesis Quality Checklist

  • Contains explicit intervention and audience
  • Specifies measurable metric change
  • States plausible causal reason
  • Includes expected minimum effect
  • Defines failure condition
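
One way to apply this checklist consistently is to capture each hypothesis as a structured record and flag blank items before the test is approved. The sketch below is illustrative only; the class, its field names, and the example values are hypothetical and not part of this skill's scripts.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    intervention: str         # what we change
    audience: str             # who sees the change
    metric: str               # measurable metric expected to move
    expected_min_effect: str  # smallest change we expect, e.g. "+1.5 pp"
    mechanism: str            # plausible causal reason
    failure_condition: str    # result that would make us reject the idea

    def missing_items(self) -> list[str]:
        """Return the names of checklist fields that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def statement(self) -> str:
        """Render the hypothesis in If/Then/Because form."""
        return (
            f"If we change {self.intervention} for {self.audience}, "
            f"then {self.metric} will change by {self.expected_min_effect}, "
            f"because {self.mechanism}."
        )


if __name__ == "__main__":
    h = Hypothesis(
        intervention="the checkout button label",
        audience="new mobile visitors",
        metric="checkout conversion",
        expected_min_effect="+1.5 percentage points",
        mechanism="a clearer label reduces hesitation",
        failure_condition="",  # deliberately blank to show the check
    )
    print(h.statement())
    print("Missing:", h.missing_items())
```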

Common Experiment Pitfalls

  • Underpowered tests leading to false negatives
  • Running too many simultaneous changes without isolation
  • Changing targeting or implementation mid-test
  • Stopping early on random spikes
  • Ignoring sample ratio mismatch and instrumentation drift
  • Declaring success from p-value without effect-size context
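
Sample ratio mismatch can be checked with a chi-square goodness-of-fit test on the assignment counts. The sketch below assumes a two-variant test and uses only the standard library. The function name is hypothetical, and the alert threshold in the usage example is an assumption to be set by your own team.

```python
import math


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """p-value of a chi-square test that observed counts match the planned split.

    With two groups the statistic has 1 degree of freedom, so its survival
    function can be written as erfc(sqrt(chi2 / 2)).
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Hypothetical counts for a planned 50/50 split.
    p = srm_p_value(5000, 5300)
    threshold = 0.001  # assumed alert threshold, not taken from this skill
    print(f"SRM p-value: {p:.4f}, investigate: {p < threshold}")
```

A very small p-value suggests the split itself is broken (for example by assignment or logging bugs), in which case the outcome metrics should not be trusted until the cause is found.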

Statistical Interpretation Guardrails

  • p-value < alpha indicates evidence against null, not guaranteed truth.
  • A confidence interval that crosses zero (no effect) means the direction of the effect is uncertain.
  • Wide intervals imply low precision even when significant.
  • Use practical significance thresholds tied to business impact.
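
These guardrails can be turned into an explicit decision rule. The following sketch assumes a metric where higher is better and a positive practical threshold expressed in the same units as the interval. The function name and the four labels are hypothetical, not terminology from this skill.

```python
def classify_result(ci_low, ci_high, practical_threshold):
    """Map a confidence interval for the lift onto a coarse decision label."""
    if practical_threshold <= 0:
        raise ValueError("practical_threshold must be positive")
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")

    if ci_low >= practical_threshold:
        # The whole interval clears the business threshold.
        return "practically significant"
    if ci_low > 0:
        # Statistically positive, but the true lift may be too small to matter.
        return "positive, practical impact unproven"
    if ci_high < 0:
        # The whole interval is below zero.
        return "negative"
    # The interval crosses zero, so the direction is uncertain.
    return "inconclusive"


if __name__ == "__main__":
    print(classify_result(0.002, 0.030, practical_threshold=0.010))
    # prints "positive, practical impact unproven"
```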

See:

  • references/experiment-playbook.md
  • references/statistics-reference.md

Tooling

scripts/sample_size_calculator.py

Computes required sample size (per variant and total) from:

  • baseline rate
  • MDE (absolute or relative)
  • significance level (alpha)
  • statistical power

Example:

```bash
python3 scripts/sample_size_calculator.py \
  --baseline-rate 0.10 \
  --mde 0.015 \
  --mde-type absolute \
  --alpha 0.05 \
  --power 0.8
```
