Agent skill
agentic-testing
Autonomously verify code changes by building, launching Script Kit GPUI, sending stdin JSON commands, capturing screenshots, and reading logs. Run after implementing changes to confirm correctness.
Install this agent skill to your Project
npx add-skill https://github.com/johnlindquist/script-kit-next/tree/main/.claude/skills/agentic-testing
SKILL.md
Agentic Testing
Verify code changes by observing the running app. Build, start via named pipe, interact via stdin JSON, capture screenshots, read logs.
When to Use
- After implementing any UI, protocol, or behavior change
- When Oracle's autonomous verification says "Run the agentic-testing skill"
- Before marking a task as complete
- Especially after changes to: prompts, views, keyboard handlers, ACP chat, actions dialog
Safety Rules (MANDATORY)
- NEVER delete files, directories, or data
- NEVER modify databases, user data, or production state
- NEVER run destructive commands (rm -rf, DROP, git push --force, git reset --hard)
- NEVER send requests to external services, APIs, or webhooks
- NEVER modify files outside the project directory
- NEVER commit, push, or modify git history
- ALL verification is read-only: build, launch, screenshot, grep, read logs
- Temp files (pipes, screenshots) go in project
test-screenshots/or/tmp - The app runs locally only — never connect to production
The Pattern
Every verification follows the same core loop:
1. Build
cargo build 2>&1 | tail -5
Must complete with Finished. If it fails, fix the build error first.
2. Launch via Named Pipe
PIPE=$(mktemp -u)
mkfifo "$PIPE"
export SCRIPT_KIT_AI_LOG=1
./target/debug/script-kit-gpui < "$PIPE" > /tmp/sk-test.log 2>&1 &
APP_PID=$!
exec 3>"$PIPE"
sleep 3
The exec 3>"$PIPE" keeps the write end open. Without it, the pipe closes after one write and the app exits.
3. Show the Window
echo '{"type":"show"}' >&3
sleep 1.5
The app starts hidden. Always send show first.
4. Interact
Send commands via fd 3. Common commands:
# Set filter text
echo '{"type":"setFilter","text":"search term"}' >&3
# Discover visible elements (returns semantic IDs)
echo '{"type":"getElements","requestId":"e1"}' >&3
# Select element by semantic ID (from getElements response)
echo '{"type":"batch","requestId":"b1","commands":[{"type":"selectBySemanticId","semanticId":"choice:0:apple","submit":true}]}' >&3
# Trigger a built-in view
echo '{"type":"triggerBuiltin","name":"clipboard"}' >&3
echo '{"type":"triggerBuiltin","name":"tab-ai"}' >&3
echo '{"type":"triggerBuiltin","name":"emoji"}' >&3
echo '{"type":"triggerBuiltin","name":"apps"}' >&3
echo '{"type":"triggerBuiltin","name":"file-search"}' >&3
# Simulate keys (dispatches to current view)
echo '{"type":"simulateKey","key":"enter","modifiers":[]}' >&3
echo '{"type":"simulateKey","key":"escape","modifiers":[]}' >&3
echo '{"type":"simulateKey","key":"k","modifiers":["cmd"]}' >&3
echo '{"type":"simulateKey","key":"w","modifiers":["cmd"]}' >&3
# Type individual characters (for views with text input)
echo '{"type":"simulateKey","key":"h","modifiers":[]}' >&3
5. Capture Screenshots
mkdir -p test-screenshots
echo '{"type":"captureWindow","title":"","path":"'"$(pwd)"'/test-screenshots/step-01.png"}' >&3
sleep 1
titleis substring match.""matches any window.- Path must be absolute — use
$(pwd)/prefix. - Always
sleep 1after capture for file write. - Read the PNG to visually verify. Never assume correctness without checking.
6. Read Logs
grep -i "keyword" /tmp/sk-test.log | head -20
Log format: TIMESTAMP|LEVEL|CATEGORY|cid=CORRELATION_ID message
7. Cleanup
exec 3>&-
rm -f "$PIPE"
kill $APP_PID 2>/dev/null || true
wait $APP_PID 2>/dev/null || true
8. Report
- PASS: build succeeded + expected screenshots match + expected log output
- FAIL: describe what went wrong with evidence (screenshot, log line)
Timing Guidelines
| Action | Sleep after |
|---|---|
| App startup | 3s |
show window |
1.5s |
setFilter |
1s |
triggerBuiltin (opens new view) |
3-5s |
simulateKey (view transition) |
1.5s |
simulateKey (text input) |
0.1s |
captureWindow |
1s |
| ACP context bootstrap | 5-8s |
| ACP response streaming | 10-20s |
Verification Recipes
See references/recipes.md for named verification patterns.
Key Gotchas
simulateKeydoes NOT go through GPUI'sintercept_keystrokes(). UsetriggerBuiltinfor Tab AI, notsimulateKeyTab.AcpChatViewaccepts single-charsimulateKeyfor typing,enterfor submit,w+cmd for close.- The app window auto-hides when focus is lost. If captures fail with "Window not found", the window was dismissed.
captureWindowfilters out windows under 100x100 (tray icons).- Always unset API keys if you need the setup card:
unset ANTHROPIC_API_KEY.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
Generate Component Documentation
Based on existing docs styles and specific API implementations, and referencing same name stories, generate comprehensive documentation for the new component.
Generate Component Story
Generate a comprehensive story for a new component for as example.
new-component
How to write a new component of GPUI Component.
troubleshooting
Diagnose and fix common Script Kit issues. Use when the user reports bugs, crashes, missing features, or unexpected behavior in Script Kit GPUI.
script-authoring
Create and manage TypeScript scripts for Script Kit. Use when the user wants to write a new script, edit an existing script, or understand Script Kit's SDK and metadata system.
agents
Create mdflow-backed agent files for Script Kit. Use when the user wants to create AI agents, configure agent backends (Claude, Gemini, Codex), or manage agent metadata.
Didn't find tool you were looking for?