skillshare-cli-e2e-test
Run isolated E2E tests in devcontainer from ai_docs/tests runbooks. Use this skill whenever the user asks to: run an E2E test, execute a test runbook, validate a feature end-to-end, create a new runbook, or test CLI behavior in isolation. If you need to run a multi-step CLI validation sequence (init → install → sync → verify), this is the skill — it handles ssenv isolation, flag verification, and structured reporting. Prefer this over ad-hoc docker exec sequences for any test that follows a runbook or needs reproducible isolation.
Install this agent skill to your Project
npx add-skill https://github.com/runkids/skillshare/tree/main/.skillshare/skills/skillshare-cli-e2e-test
Metadata
Additional technical details for this skill
- targets: [ "claude", "universal" ]
SKILL.md
Run isolated E2E tests in devcontainer. $ARGUMENTS specifies runbook name or "new".
Flow
Phase 0: Environment Check
1. Confirm devcontainer is running and get container ID:
   ```bash
   CONTAINER=$(docker compose -f .devcontainer/docker-compose.yml ps -q skillshare-devcontainer)
   ```
   - If empty → prompt user: `docker compose -f .devcontainer/docker-compose.yml up -d`
   - Ensure `CONTAINER` is set for all subsequent `docker exec` calls.
2. Confirm Linux binary is available:
   ```bash
   docker exec $CONTAINER bash -c \
     '/workspace/.devcontainer/ensure-skillshare-linux-binary.sh && ss version'
   ```
3. Confirm mdproof is installed:
   ```bash
   docker exec $CONTAINER /workspace/.devcontainer/ensure-mdproof.sh
   ```
   This auto-installs from GitHub release, or falls back to `/workspace/bin/mdproof` (local dev binary).
4. Check for lessons learned from previous runs:
   ```bash
   test -f /workspace/.mdproof/lessons-learned.md && cat /workspace/.mdproof/lessons-learned.md
   ```
   If the file exists, read it before writing or debugging runbooks. It contains known gotchas and assertion patterns.
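The container check above can be wrapped in a small guard so that later `docker exec` calls never run with an empty `CONTAINER`. This is a minimal sketch; `require_container` is a hypothetical helper name, not something the skill or skillshare provides.

```shell
# Hypothetical helper (not part of skillshare): validate a container ID
# before it is used in later `docker exec` calls.
require_container() {
  local id="$1"
  if [ -z "$id" ]; then
    # Same remedy the skill asks the user to run.
    echo "devcontainer is not running; start it with:" >&2
    echo "  docker compose -f .devcontainer/docker-compose.yml up -d" >&2
    return 1
  fi
  printf '%s\n' "$id"
}

# Usage (compose file and service name as quoted in the check above):
#   CONTAINER=$(require_container "$(docker compose -f .devcontainer/docker-compose.yml ps -q skillshare-devcontainer)") || exit 1
```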
Phase 1: Detect Scope
1. Preview all available runbooks via the container:
   ```bash
   docker exec $CONTAINER mdproof --dry-run --report json /workspace/ai_docs/tests/
   ```
   This returns JSON with every runbook's steps, commands, and expected assertions, so no manual markdown parsing is needed. Use this to understand what each runbook covers.
2. Identify recent changes (unstaged + recent commits):
   ```bash
   git diff --name-only HEAD~3
   ```
3. Match changes to relevant runbooks (compare changed file paths against step commands in the JSON output).
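One way to narrow the matching described above is to reduce the changed paths to command names first, then look for those names in the step commands of the dry-run JSON. The sketch below assumes each command lives in `cmd/skillshare/<command>.go`, the layout this skill greps when validating flags; `changed_commands` is a hypothetical helper, and files that do not follow that naming still need manual mapping.

```shell
# Hypothetical helper: read changed paths on stdin, print the command names
# of files directly under cmd/skillshare/ (test files map to their command).
changed_commands() {
  sed -n 's#^cmd/skillshare/\([^/]*\)\.go$#\1#p' |
    sed 's/_test$//' |
    sort -u
}

# Usage:
#   git diff --name-only HEAD~3 | changed_commands
```

Paths outside `cmd/skillshare/` (for example under `internal/`) produce no output here, so they have to be matched to runbooks by reading the change itself.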
Phase 2: Select Tests
Prompt user (via AskUserQuestion):
- Option A: Run existing runbook (list all available + mark those related to recent changes)
- Option B: Auto-generate new test script based on recent changes
- Option C: If $ARGUMENTS specifies a runbook, skip to Phase 3
Phase 3: Prepare & Execute
Running existing runbook:
1. Create isolated environment with auto-initialization:
   ```bash
   ENV_NAME="e2e-$(date +%Y%m%d-%H%M%S)"
   # Use --init to automatically run 'ss init -g' with all targets
   docker exec $CONTAINER ssenv create "$ENV_NAME" --init
   ```
2. Execute the entire runbook via mdproof inside the container:
   ```bash
   docker exec $CONTAINER env SKILLSHARE_DEV_ALLOW_WORKSPACE_PROJECT=1 \
     ssenv enter "$ENV_NAME" -- \
     mdproof --report json \
     /workspace/ai_docs/tests/<runbook_file>.md
   ```
   mdproof executes each step (`bash -c <command>`) in the ssenv-isolated HOME, then returns structured JSON:
   ```json
   {
     "version": "1",
     "runbook": "<runbook_file>.md",
     "duration_ms": 12345,
     "summary": { "total": 7, "passed": 5, "failed": 1, "skipped": 1 },
     "steps": [
       {
         "step": { "number": 1, "title": "...", "command": "...", "expected": ["..."] },
         "status": "passed",  // "passed" | "failed" | "skipped"
         "exit_code": 0,
         "stdout": "...",
         "stderr": "..."
       }
     ]
   }
   ```
3. Analyze the JSON output:
   - All passed → proceed to Phase 4
   - Any failed → filter for failures only (full JSON can be too large for terminal output):
     ```bash
     mdproof --report json runbook.md 2>&1 | jq '{
       summary: .summary,
       failed: [.steps[] | select(.status == "failed") | {
         step: .step.number,
         title: .step.title,
         exit_code: .exit_code,
         failed_assertions: [.assertions[]? | select(.matched == false) | .pattern],
         stderr: (.stderr // "" | .[0:200])
       }]
     }'
     ```
   - Skipped steps (executor=`manual`) → these need manual verification, run them individually:
     ```bash
     docker exec $CONTAINER env SKILLSHARE_DEV_ALLOW_WORKSPACE_PROJECT=1 \
       ssenv enter "$ENV_NAME" -- <command from step.command>
     ```
4. For failed steps, debug individually using manual docker exec (same as before):
   ```bash
   docker exec $CONTAINER env SKILLSHARE_DEV_ALLOW_WORKSPACE_PROJECT=1 \
     ssenv enter "$ENV_NAME" -- bash -c '<failed step command>'
   ```
   - Prefer `--json` + `jq` for assertions; see the `--json` Quick Reference below
Generating new runbook:
1. Read `git diff HEAD~3` to find changed files in `cmd/skillshare/` or `internal/`
2. Read changed files to understand new/modified functionality
3. Validate all CLI flags before writing. For every `ss <command> <flag>` in the runbook:
   - Grep `cmd/skillshare/<command>.go` for the exact flag string (e.g. `"--force"`)
   - Run `ss <command> --help` inside container if needed
   - Common mistakes to avoid:
     - `uninstall --yes` → wrong, use `--force` / `-f`
     - `init --target <name>` → wrong, `init` has no `--target` flag
     - `init -p` has a completely separate flag set from global `init`: it only supports `--targets`, `--discover`, `--select`, `--mode`, `--dry-run`. Global-only flags like `--no-copy`, `--no-skill`, `--no-git`, `--all-targets`, `--force` do NOT exist in project mode
     - Audit custom rules: disable by rule ID (e.g. `prompt-injection-0`, `prompt-injection-1`), NOT pattern name (e.g. `prompt-injection`). Rule IDs are in `internal/audit/rules.yaml`
4. Generate new runbook to `ai_docs/tests/<slug>_runbook.md`, following existing conventions:
   - YAML-free, pure Markdown
   - Has Scope, Environment, Steps (each with bash + Expected), Pass Criteria
   - Use `jq:` assertions in Expected blocks for JSON commands, e.g. `- jq: .extras | length == 1`. This is a native mdproof assertion type, NOT a bash `jq` pipe
   - Use `--json` + `jq -e` in bash for inline verification within multi-command steps
   - Config idempotency: never use a bare `cat >> config.yaml`; always prepend `sed -i '/^section:/,$d'` to remove the existing section first, or use CLI commands (`ss extras init`, `ss extras remove --force`) that handle duplicates
   - Check `ai_docs/tests/runbook.json` for project-level config (build, setup, teardown, step_setup, timeout) that affects all runbooks
   - Check `.mdproof/lessons-learned.md` for known assertion patterns and gotchas
5. Run the runbook quality checklist (see below) before executing
6. Then execute the new runbook (same flow as above)
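The config idempotency convention above can be tried out in a scratch file. This is a minimal sketch, assuming GNU `sed` (as in the Linux devcontainer); `append_extras`, the `mode:` key and the `extras:` content are illustrative stand-ins, not skillshare's real config schema.

```shell
# Scratch file standing in for config.yaml (illustrative content only).
cfg=$(mktemp)
printf 'mode: merge\n' > "$cfg"

append_extras() {
  # Drop any existing section (from its key to end of file), then append.
  sed -i '/^extras:/,$d' "$cfg"
  cat >> "$cfg" <<'EOF'
extras:
  - name: rules
EOF
}

append_extras
append_extras               # re-run: the section is replaced, not duplicated
grep -c '^extras:' "$cfg"   # prints 1
```

The `,$d` range runs to the end of the file, so anything after the matched key is removed as well. The pattern only suits sections that are kept last in the file.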
Phase 4: Cleanup & Report
1. Ask user before cleanup (via AskUserQuestion):
   - Option A: Delete ssenv environment now
   - Option B: Keep for manual debugging (print env name for later `ssenv delete`)
2. If user chose Option A:
   ```bash
   docker exec $CONTAINER ssenv delete "$ENV_NAME" --force
   ```
3. Output summary (derived from the runbook JSON output):
   ```
   ── E2E Test Report ──
   Runbook:  {runbook name}
   Env:      {ENV_NAME}
   Duration: {duration_ms}ms

   Step 1: {title}  PASS
   Step 2: {title}  PASS
   Step 3: {title}  FAIL  ← exit_code={N}, stderr: {error detail}
   ...

   Result: {passed}/{total} passed ({skipped} skipped)
   ```
   All values come directly from mdproof's JSON output: `summary.passed`, `summary.total`, `steps[].step.title`, `steps[].status`.
4. If any FAIL → distinguish between runbook bug vs real bug:
   - Runbook bug: wrong flag, wrong file path, stale assertion → fix runbook, re-run step
   - Real bug: CLI misbehavior → analyze cause, provide fix suggestions
5. Retrospective: ask user (via AskUserQuestion):
   Did you encounter any friction during this test run that the skill or runbook could handle better?
- Option A: Yes, improve e2e skill — review test friction (wrong flags, stale assertions, missing checklist items, unclear instructions), then update SKILL.md and/or runbooks
- Option B: Yes, but only fix the runbook — fix the specific runbook without changing the skill itself
- Option C: No, skip
Improvement targets:
- SKILL.md: add new checklist items, common-mistake examples, or rule clarifications learned from this run
- Runbooks: fix stale assertions (e.g. config.yaml → registry.yaml), wrong flags, outdated paths
- Both: when a systemic issue (e.g. a refactor changed file locations) affects both the skill's guidance and existing runbooks
Runbook Quality Checklist
Before executing a newly generated runbook, verify:
- All CLI flags exist: every `ss <cmd> --flag` was grep-verified against source
- `--init` interaction: if runbook has `ss init`, account for `ssenv create --init` already initializing (add `--force` to re-init, or skip init step)
- `--init` creates default extras: `ssenv create --init` creates a `rules` extra by default. Runbooks that assume an empty extras list must add cleanup first: `ss extras remove rules --force -g 2>/dev/null || true` + `rm -rf ~/.claude/rules`
- Correct confirmation flags: `uninstall` uses `--force` (not `--yes`); `init` re-run needs no flag (just fails gracefully)
- Skill data in registry.yaml: assertions about installed skills check `registry.yaml`, NOT `config.yaml`; `config.yaml` should never contain `skills:`
- File existence timing: `registry.yaml` is only created after first install/reconcile, not on `ss init`
- Project mode paths: project commands use `.skillshare/`, not `~/.config/skillshare/`
- Project init flags: `init -p` only supports `--targets`, `--discover`, `--select`, `--mode`, `--dry-run`; global-only flags (`--no-copy`, `--no-skill`, `--no-git`, `--all-targets`, `--force`) are not available
- Audit rule IDs: custom rules in `audit-rules.yaml` use rule IDs (e.g. `prompt-injection-0`), not pattern names (e.g. `prompt-injection`). Verify IDs against `internal/audit/rules.yaml`
- Use `--json` for assertions: if the command supports `--json`, use it with `jq` instead of grepping human-readable output. Text output changes between versions; JSON structure is stable
- Expected = actual substrings, NOT descriptions: the runbook assertion engine does case-insensitive substring matching. Write `- Installed` or `- cangjie-docs-navigator`, NOT `- Install completes without error` or `- Output contains at least one skill`. Negation: use the `Not <substring>` prefix (e.g. `- Not cangjie-docs-navigator`)
- Skill name ≠ repo name: after `ss install <repo>`, the actual skill name may differ from the repo name (e.g. repo `cangjie-docs-mcp` → skill `cangjie-docs-navigator`). Always verify the installed skill name via `ss list` before writing uninstall/check steps
- `/tmp/` cleanup: ssenv only isolates `$HOME`; `/tmp/` is shared across runs. Any step using `/tmp/<path>` must start with `rm -rf /tmp/<path>` to avoid stale state from previous runs
- `echo > symlink` writes through: `echo "content" > path` where `path` is a symlink writes to the symlink's target; it does NOT replace the symlink with a real file. To create a local (non-managed) file at a symlinked path, either use a different filename, or `rm` the symlink first and then `echo`
- `cat >>` is not idempotent: appending to config files (`cat >> config.yaml`) will duplicate sections on re-run. Prefer `ss extras init` (which validates duplicates) or full file replacement over `cat >>` when possible
- Extras source path layout: extras use `~/.config/skillshare/extras/<name>/` (not the legacy flat path `~/.config/skillshare/<name>/`). Symlink assertions must include `extras/` in the path regex (e.g. `regex: skillshare/extras/rules/tdd\.md`)
- Prefer `jq:` over `python3 -c`: for JSON output validation, use mdproof's native `jq:` assertion type (e.g. `- jq: .extras | length == 1`) instead of piping to `python3 -c`. It's one line vs 10, and mdproof handles failure reporting automatically
- Config append idempotency: when appending YAML sections with `cat >>`, always prepend `sed -i '/^section_key:/,$d'` to remove the existing section. Or prefer CLI commands (`ss extras init`, `ss extras remove --force`) over manual config editing
- Check lessons-learned: read `.mdproof/lessons-learned.md` before writing new runbooks for known gotchas and proven assertion patterns
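The symlink write-through item in the checklist above is easy to reproduce outside skillshare. The sketch below uses a scratch directory and made-up file names (`source.md`, `link.md`) to show both the pitfall and the `rm`-first workaround.

```shell
work=$(mktemp -d)
printf 'managed\n' > "$work/source.md"
ln -s "$work/source.md" "$work/link.md"

# Redirecting into the symlink follows it: the target is overwritten
# and link.md is still a symlink.
echo "local" > "$work/link.md"
cat "$work/source.md"                        # prints: local
[ -L "$work/link.md" ] && echo "still a symlink"

# To get a real local file at that path, remove the symlink first.
printf 'managed\n' > "$work/source.md"       # restore the managed content
rm "$work/link.md"
echo "local" > "$work/link.md"
cat "$work/source.md"                        # prints: managed
[ -L "$work/link.md" ] || echo "regular file"
```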
Runbook Assertion Types
mdproof supports 6 assertion types under Expected: blocks. Use the most specific type for each check:
| Type | Syntax | When to use | Example |
|---|---|---|---|
| Substring | plain text | Simple output check | `- hello world` |
| Negated | `Not` / `Should NOT` prefix | Verify absence | `- Not FAIL` |
| Exit code | `exit_code: N` | Every step should have this | `- exit_code: 0` |
| Regex | `regex:` prefix | Pattern matching | `- regex: v\d+\.\d+` |
| jq | `jq:` prefix | JSON output (preferred) | `- jq: .extras \| length == 1` |
| Snapshot | `snapshot:` prefix | Stable output comparison | `- snapshot: api-response` |
jq: best practices:
```markdown
# Simple field check
- jq: .name == "rules"
# Array length
- jq: .extras | length == 3
# Sorted array comparison
- jq: [.extras[].name] | sort | . == ["a","b","c"]
# Null/missing field (omitempty)
- jq: .extras == null
# Nested access
- jq: .[0].targets[0].status == "synced"
# Boolean
- jq: .source_exists == true
```
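For the Substring and Negated rows of the assertion table, it can help to see why a descriptive sentence never matches. The sketch below imitates case-insensitive substring matching with `grep`; it is not mdproof's implementation, `expect_line` is a hypothetical name, and only the `Not ` prefix is handled.

```shell
# Sketch only (NOT mdproof's code): case-insensitive substring check with
# a "Not " prefix for negation.
expect_line() {
  local output="$1"
  local expected="$2"
  case $expected in
    "Not "*)
      # Negation: pass only when the substring is absent.
      ! printf '%s\n' "$output" | grep -qiF -- "${expected#Not }"
      ;;
    *)
      printf '%s\n' "$output" | grep -qiF -- "$expected"
      ;;
  esac
}

# Usage:
#   expect_line "Installed cangjie-docs-navigator" "installed"   # succeeds
#   expect_line "Installed cangjie-docs-navigator" "Install completes without error"   # fails
```

The second call fails because the sentence is not a literal substring of the output, which is why Expected entries should quote text the command actually prints.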
Rules
- Always execute inside devcontainer: use `docker exec`, never run CLI on host
- Always use `ssenv` for HOME isolation: don't pollute container default HOME
- Always create fresh ssenv environments: never reuse an environment from a previous run; stale config/state causes confusing cascade failures (e.g. duplicate YAML keys, "already exists" errors)
- ssenv only isolates `$HOME`: `/tmp/`, `/var/`, and other system paths are shared across all environments. Runbook steps using `/tmp/` must include `rm -rf` cleanup at the start
- Verify every step: never skip Expected checks
- Don't abort on failure: record FAIL, continue to next step, summarize at end
- Ask before cleanup: Phase 4 must prompt user before deleting ssenv environment
- `ss` = `skillshare`: same binary in runbooks
- `~` = ssenv-isolated HOME: `ssenv enter` auto-sets `HOME`
- Use `--init`: simplify setup by using `ssenv create <name> --init`
- `--init` already runs init: the env is pre-initialized; runbook steps calling `ss init` again will fail unless the step explicitly resets state first
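When steps are re-run by hand rather than through mdproof, the "don't abort on failure" rule above amounts to a loop that records each result and keeps going. This is a minimal sketch; `run_all_steps` is a hypothetical helper and the commands passed to it are placeholders.

```shell
# Hypothetical helper: run every command, record PASS/FAIL, never stop early.
run_all_steps() {
  local total=0
  local failed=0
  local cmd
  for cmd in "$@"; do
    total=$((total + 1))
    # Run in a subshell so a failing step cannot abort the loop.
    if (eval "$cmd") >/dev/null 2>&1; then
      echo "Step $total: PASS"
    else
      echo "Step $total: FAIL"
      failed=$((failed + 1))
    fi
  done
  echo "Result: $((total - failed))/$total passed"
  # Exit status reflects the overall result, after all steps have run.
  [ "$failed" -eq 0 ]
}

# Usage:
#   run_all_steps 'true' 'false' 'echo ok'
```

With the three placeholder commands in the usage line, the second step is reported as FAIL, the third still runs, and the last line reads `Result: 2/3 passed`.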
ssenv Quick Reference
| Command | Purpose |
|---|---|
| `sshelp` | Show shortcuts and usage |
| `ssls` | List isolated environments |
| `ssnew <name>` | Create + enter isolated shell (interactive) |
| `ssuse <name>` | Enter existing isolated shell (interactive) |
| `ssback` | Leave isolated context |
| `ssenv enter <name> -- <cmd>` | Run single command in isolation (automation) |

- For interactive debugging: `ssnew <env>` then `exit` when done
- For deterministic automation: prefer `ssenv enter <env> -- <command>` one-liners
Test Command Policy
When running Go tests inside devcontainer (not via runbook):
```bash
# ssenv changes HOME, so always cd to /workspace first for Go test commands
cd /workspace
go build -o bin/skillshare ./cmd/skillshare
SKILLSHARE_TEST_BINARY="$PWD/bin/skillshare" go test ./tests/integration -count=1
go test ./...
```
Always run in devcontainer unless there is a documented exception.
Note: ssenv enter changes HOME, which may affect Go module resolution — always cd /workspace before running go test or go build.
--json Quick Reference
Most commands support --json for structured output, making assertions more reliable than text matching.
| Command | `--json` | Notes |
|---|---|---|
| `ss status` | `--json` | Skills, targets, sync status |
| `ss list` | `--json` / `-j` | All skills with metadata |
| `ss target list` | `--json` | Configured targets |
| `ss install <src>` | `--json` | Implies `--force --all` (skip prompts) |
| `ss uninstall <name>` | `--json` | Implies `--force` (skip prompts) |
| `ss collect <path>` | `--json` | Implies `--force` (skip prompts) |
| `ss check` | `--json` | Update availability per repo |
| `ss update` | `--json` | Update results per skill |
| `ss diff` | `--json` | Per-file diff details |
| `ss sync` | `--json` | Sync stats per target |
| `ss audit` | `--format json` | Also accepts `--json` (deprecated alias) |
| `ss log` | `--json` | Raw JSONL (one object per line) |

Key behaviors:
- `--json` that implies `--force`/`--all` skips interactive prompts, so it is safe for automation
- Output goes to stdout only (progress/spinners suppressed)
- `audit` prefers `--format json`; `--json` still works but is the deprecated form
- `log --json` outputs JSONL (newline-delimited), not a JSON array
Assertion Patterns with jq
```bash
# Count installed skills
ss list --json | jq 'length'
# Check a specific skill exists
ss list --json | jq -e '.[] | select(.name == "my-skill")'
# Verify target is configured
ss target list --json | jq -e '.[] | select(.name == "claude")'
# Assert no critical audit findings
ss audit --format json | jq -e '.summary.critical == 0'
# Check update availability
ss check --json | jq -e '.tracked_repos | length > 0'
# Verify sync succeeded (zero errors)
ss sync --json | jq -e '.errors == 0'
# Install and verify result
ss install https://github.com/user/repo --json | jq -e '.skills | length > 0'
```
When a `jq -e` expression fails, jq exits non-zero (1 = the last output was `false` or `null`, 4 = no output was produced, 5 = runtime error), so the step FAILs without any ambiguous text matching.
Container Command Templates
```bash
# Single command
docker exec $CONTAINER ssenv enter "$ENV_NAME" -- ss status
# JSON assertion (preferred for verification)
docker exec $CONTAINER ssenv enter "$ENV_NAME" -- bash -c '
  ss list --json | jq -e ".[] | select(.name == \"my-skill\")"
'
# Multi-line compound command (use bash -c), global mode flags
docker exec $CONTAINER ssenv enter "$ENV_NAME" -- bash -c '
  ss init --no-copy --all-targets --no-git --no-skill
  ss status
'
# Project mode init (different flag set!)
docker exec $CONTAINER env SKILLSHARE_DEV_ALLOW_WORKSPACE_PROJECT=1 \
  ssenv enter "$ENV_NAME" -- bash -c '
  cd /tmp/test-project && ss init -p --targets claude
'
# Check files (HOME is set to isolated path by ssenv)
docker exec $CONTAINER ssenv enter "$ENV_NAME" -- bash -c '
  cat ~/.config/skillshare/config.yaml
'
# With environment variables
docker exec $CONTAINER ssenv enter "$ENV_NAME" -- bash -c '
  TARGET=~/.claude/skills
  ls -la "$TARGET"
'
# Go tests (must cd /workspace because ssenv changes HOME)
docker exec $CONTAINER ssenv enter "$ENV_NAME" -- bash -c '
  cd /workspace
  go test ./internal/install -run TestParseSource -count=1
'
```
Relationship with /mdproof Skill
This skill (/cli-e2e-test) and the /mdproof skill are complementary, not competing:
| Concern | `/cli-e2e-test` | `/mdproof` |
|---|---|---|
| Scope | Skillshare project-specific E2E | General-purpose runbook authoring |
| Infrastructure | Devcontainer, ssenv, binary build | None (format and assertions only) |
| Config | `ai_docs/tests/runbook.json` (build, setup, teardown) | Assertion types, snapshot, coverage |
| Lessons | Checklist items, CLI flag gotchas | `.mdproof/lessons-learned.md` |
| When | Running or debugging a test | Writing or improving a runbook |
How they work together
- Writing a new runbook → invoke `/mdproof` first for format guidance (assertion types, `jq:` patterns, snapshot usage), then `/cli-e2e-test` to execute it in isolation
- Improving existing runbooks → invoke `/mdproof` for assertion quality review (`python3` → `jq:`, idempotency), then `/cli-e2e-test` to verify changes pass
- Debugging failures → `/cli-e2e-test` Phase 3 step 4 handles manual docker exec; `/mdproof` lessons-learned captures recurring patterns
- After a test run → `/mdproof` Self-Learning section guides recording discoveries to `.mdproof/lessons-learned.md`
Rule of thumb
- Need to run tests or debug in devcontainer? → `/cli-e2e-test`
- Need to write assertions or improve runbook quality? → `/mdproof`
- User says "run extras E2E" → `/cli-e2e-test`
- User says "improve runbook assertions" → `/mdproof` then `/cli-e2e-test` to verify