experiment-designer

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.

Stars 1,878

Forks 294

Install this agent skill in your project

npx add-skill https://github.com/LeoYeAI/openclaw-master-skills/tree/main/skills/experiment-designer

SKILL.md

Experiment Designer

Design, prioritize, and evaluate product experiments with clear hypotheses and defensible decisions.

When To Use

Use this skill for:

A/B and multivariate experiment planning
Hypothesis writing and success criteria definition
Sample size and minimum detectable effect planning
Experiment prioritization with ICE scoring
Reading statistical output for product decisions

Core Workflow

Write hypothesis in If/Then/Because format

If we change [intervention]
Then [metric] will change by [expected direction/magnitude]
Because [behavioral mechanism]

Define metrics before running test

Primary metric: single decision metric
Guardrail metrics: quality/risk protection
Secondary metrics: diagnostics only

Estimate sample size

Baseline conversion or baseline mean
Minimum detectable effect (MDE)
Significance level (alpha) and power

Use:

bash

python3 scripts/sample_size_calculator.py --baseline-rate 0.12 --mde 0.02 --mde-type absolute
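For intuition about what such a calculation involves, here is a minimal sketch of a common normal-approximation formula for a two-sided test of two proportions. It is not confirmed to be the method `scripts/sample_size_calculator.py` uses; the function name `sample_size_per_variant` and the defaults `alpha=0.05` and `power=0.8` are assumptions made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate per-variant sample size for a two-sided two-proportion z-test.

    Assumption: equal allocation between control and treatment, and an
    absolute MDE (e.g. 0.02 means 12% -> 14%).
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / mde_abs ** 2)


# Same inputs as the command above: 12% baseline, 2 points absolute MDE.
n = sample_size_per_variant(0.12, 0.02)
print(f"per variant: {n}, total: {2 * n}")
```

Halving the MDE roughly quadruples the required sample, which is why the MDE should be fixed before launch rather than tuned to fit available traffic.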

Prioritize experiments with ICE

Impact: potential upside
Confidence: evidence quality
Ease: cost/speed/complexity

ICE Score = (Impact * Confidence * Ease) / 10
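The scoring formula above can be sketched in a few lines. The `Idea` class, the example backlog, and the 1 to 10 scale for each factor are assumptions for illustration; the skill does not state a scale.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: float      # potential upside (assumed 1-10 scale)
    confidence: float  # evidence quality (assumed 1-10 scale)
    ease: float        # cost/speed/complexity (assumed 1-10 scale)

    @property
    def ice(self) -> float:
        # ICE Score = (Impact * Confidence * Ease) / 10
        return (self.impact * self.confidence * self.ease) / 10


# Hypothetical backlog entries.
backlog = [
    Idea("Simplify checkout form", impact=8, confidence=6, ease=5),
    Idea("New pricing page headline", impact=5, confidence=7, ease=9),
    Idea("Rebuild onboarding flow", impact=9, confidence=4, ease=2),
]

# Highest score first.
ranked = sorted(backlog, key=lambda idea: idea.ice, reverse=True)
for idea in ranked:
    print(f"{idea.ice:5.1f}  {idea.name}")
```

Because the factors are multiplied, a single low score (here, the low ease of the onboarding rebuild) pulls an idea down sharply even when its impact is high.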

Launch with stopping rules

Decide fixed sample size or fixed duration in advance
Avoid repeated peeking unless you use a method designed for it (for example, a sequential testing design)
Monitor guardrails continuously

Interpret results

Statistical significance is not business significance
Compare point estimate + confidence interval to decision threshold
Investigate novelty effects and segment heterogeneity
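As an illustration of comparing an estimate and its interval to a decision threshold, here is a minimal sketch for a conversion-rate test. It uses a Wald (normal-approximation) interval for the difference in proportions; the counts, the 1-point threshold, and the three-way `decide` rule are hypothetical, not part of the skill.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Point estimate and Wald CI for (rate_b - rate_a)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return diff, diff - z * se, diff + z * se


def decide(lower, upper, threshold):
    """Hypothetical rule: act only when the whole interval clears the threshold."""
    if lower >= threshold:
        return "ship"
    if upper < threshold:
        return "do not ship"
    return "inconclusive"


# Hypothetical data: control 600/5000, treatment 700/5000.
diff, lower, upper = diff_confidence_interval(600, 5000, 700, 5000)
decision = decide(lower, upper, threshold=0.01)
print(f"lift={diff:.4f}, CI=({lower:.4f}, {upper:.4f}), decision={decision}")
```

In this example the interval excludes zero, so the result is statistically significant, yet its lower bound sits below the 1-point business threshold, so the rule returns "inconclusive".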

Hypothesis Quality Checklist

Contains explicit intervention and audience
Specifies measurable metric change
States plausible causal reason
Includes expected minimum effect
Defines failure condition
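The checklist can also be applied mechanically to a draft. The sketch below is one possible encoding; the `Hypothesis` fields and the `missing_items` helper are hypothetical names chosen to mirror the checklist items, and the check only tests that each field is filled in, not that it is well written.

```python
from dataclasses import dataclass, fields


@dataclass
class Hypothesis:
    intervention: str = ""         # If we change ...
    audience: str = ""             # ... for whom
    metric: str = ""               # Then [metric] will change ...
    expected_min_effect: str = ""  # ... by at least this much
    mechanism: str = ""            # Because ...
    failure_condition: str = ""    # what result would count as a failure


def missing_items(h: Hypothesis) -> list[str]:
    """Return the names of checklist fields that are still empty."""
    return [f.name for f in fields(h) if not getattr(h, f.name).strip()]


# Hypothetical draft that has no failure condition yet.
draft = Hypothesis(
    intervention="show shipping cost on the product page",
    audience="new visitors on mobile",
    metric="checkout conversion rate",
    expected_min_effect="+1 percentage point",
    mechanism="fewer surprises at checkout reduce abandonment",
)
gaps = missing_items(draft)
print("missing:", gaps)
```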

Common Experiment Pitfalls

Underpowered tests leading to false negatives
Running too many simultaneous changes without isolation
Changing targeting or implementation mid-test
Stopping early on random spikes
Ignoring sample ratio mismatch and instrumentation drift
Declaring success from p-value without effect-size context
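Sample ratio mismatch is one pitfall that is cheap to check automatically. A minimal sketch for a two-arm test follows, using a chi-square goodness-of-fit test with one degree of freedom. The function name, the example counts, and the 0.001 flagging threshold are assumptions for illustration.

```python
from math import erfc, sqrt


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """p-value of a chi-square goodness-of-fit test on the two arm sizes."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = (
        (n_control - exp_control) ** 2 / exp_control
        + (n_treatment - exp_treatment) ** 2 / exp_treatment
    )
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


# Hypothetical assignment counts for a planned 50/50 split.
p = srm_p_value(50_000, 51_200)
if p < 0.001:  # assumed flagging threshold
    print(f"Possible sample ratio mismatch (p={p:.2g}); check assignment and logging")
```

A very small p-value here points to a problem in randomization or instrumentation, so the experiment's metric results should not be trusted until the cause is found.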

Statistical Interpretation Guardrails

A p-value below alpha indicates evidence against the null hypothesis; it does not prove the effect is real.
A confidence interval that crosses zero (the no-effect value) means the direction of the effect is uncertain.
Wide intervals imply low precision even when the result is significant.
Use practical significance thresholds tied to business impact.

See:

references/experiment-playbook.md
references/statistics-reference.md

Tooling

`scripts/sample_size_calculator.py`

Computes required sample size (per variant and total) from:

baseline rate
MDE (absolute or relative)
significance level (alpha)
statistical power

Example:

bash

python3 scripts/sample_size_calculator.py \
  --baseline-rate 0.10 \
  --mde 0.015 \
  --mde-type absolute \
  --alpha 0.05 \
  --power 0.8
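Absolute and relative MDE describe the same change in different units. The sketch below shows the conversion for a conversion-rate baseline; the helper `to_absolute_mde` is hypothetical, and `relative` as an accepted value of `--mde-type` is an assumption based on the input list above.

```python
def to_absolute_mde(baseline_rate, mde, mde_type):
    """Convert an MDE to absolute terms (a difference in rates)."""
    if mde_type == "absolute":
        return mde
    if mde_type == "relative":
        # A relative MDE is a fraction of the baseline rate.
        return baseline_rate * mde
    raise ValueError(f"unknown mde_type: {mde_type!r}")


# A 15% relative lift on a 10% baseline is the same target as the
# 0.015 absolute MDE in the example command.
absolute = to_absolute_mde(0.10, 0.15, "relative")
print(f"absolute MDE: {absolute:.3f}")
```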

Maintainer

LeoYeAI Core maintainer

Source details

Full Name: LeoYeAI/openclaw-master-skills
Branch: main
Path in repo: skills/experiment-designer
License: MIT License
Topics: skills openclaw ai-agent agentskills myclaw curated skill-collection weekly
