qa-test

Browser-based QA verification after any implementation. Use when someone says "QA this", "test this in browser", "verify the feature", "qa test", "browser test", or after completing an /implement-change to verify acceptance criteria in a real browser. Opens Chrome via MCP, exercises each acceptance criterion, verifies via DOM snapshots, and reports pass/fail. The "closer" for every implementation — proof it works, not just that tests pass.

Install this agent skill to your Project

npx add-skill https://github.com/teambrilliant/dev-skills/tree/main/skills/qa-test

SKILL.md

QA Test

Verify implemented features in a real browser. Exercise each acceptance criterion, verify via snapshots, report results.

Context-efficient design: browser testing runs in a sub-agent so that snapshot and interaction data stay out of the main thread. The main thread sees only compact pass/fail summaries.

Process

1. Pre-flight (sub-agent): gather criteria, resolve URL, check environment
2. Interactive setup: human steers the browser for hard-to-automate steps (login, drag, etc.)
3. Browser testing (sub-agent): exercises all criteria in an isolated context
4. Report results: main thread receives a compact summary only
5. Handle failures: retry failed criteria, after manual intervention if needed

1. Pre-flight Sub-agent

Launch an Explore sub-agent before any browser interaction to gather all context in parallel.

Sub-agent prompt:

Gather QA pre-flight context for testing. Return a structured JSON block with:

1. **acceptance_criteria**: List of testable criteria. Check these sources in order,
   stop at the first that has criteria:
   - The user's prompt (if criteria were given explicitly)
   - Current PR description: run `gh pr view --json body` via Bash
   - Current branch diff: run `git diff main...HEAD --stat` then read changed files
     to infer what user-visible behavior changed
   - Linked issue: check PR body for issue references, fetch with `gh issue view`

2. **test_url**: Where to test. Check in order:
   - `.tap/tap-audit.md` Environments section
   - `package.json` scripts for `dev`, `start`, or similar
   - Common defaults: localhost:3000, :5173, :4321, :6886

3. **app_running**: Try to fetch the test_url via `curl -s -o /dev/null -w '%{http_code}'`.
   Return the status code. If not running, return the dev command that would start it.

4. **test_pages**: List of specific page URLs/routes to visit based on the changed files
   (e.g., if `modules/campaigns/` changed, the test page is likely `/campaigns/...`)

5. **db_available**: Check if postgres MCP tools are available (search for
   `mcp__postgres__execute_sql` or similar). Return true/false.

6. **has_async_flows**: Based on the changed files, flag whether the feature involves
   background jobs (Temporal workflows, queues, webhooks) that need async verification.

7. **needs_login**: Whether the app requires authentication. Check for login pages,
   auth middleware, or session requirements in the codebase.

Return results as a structured summary, not raw tool output.

Using the pre-flight results:

- If app_running is not 200, start the dev server in the background and wait until it responds
- Use acceptance_criteria as the test plan
- Use test_pages to decide where to navigate first
- If db_available, include database verification steps
- If has_async_flows, use the async testing pattern
- If needs_login, prompt the user for interactive setup before launching the browser sub-agent
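
The rules above amount to a small decision procedure. A minimal sketch follows, assuming the pre-flight result has been parsed into a Python object; the `Preflight` class, the `dev_command` field, and `plan_steps` are hypothetical names used only for illustration and are not part of the skill.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Preflight:
    """Hypothetical container for the fields the pre-flight sub-agent returns."""
    acceptance_criteria: list[str]
    test_url: str
    app_running: int                     # HTTP status code from the curl probe
    test_pages: list[str] = field(default_factory=list)
    db_available: bool = False
    has_async_flows: bool = False
    needs_login: bool = False
    dev_command: Optional[str] = None    # assumed field: command that starts the app


def plan_steps(pf: Preflight) -> list[str]:
    """Translate pre-flight results into an ordered list of next actions."""
    steps: list[str] = []
    if pf.app_running != 200:
        steps.append(f"start dev server: {pf.dev_command or 'unknown command'}")
    if pf.needs_login:
        steps.append("interactive setup: ask user to log in")
    first_page = pf.test_pages[0] if pf.test_pages else pf.test_url
    steps.append(f"navigate to {first_page}")
    steps.append(f"test {len(pf.acceptance_criteria)} criteria in browser sub-agent")
    if pf.db_available:
        steps.append("include database verification")
    if pf.has_async_flows:
        steps.append("use async testing pattern")
    return steps
```

Keeping the plan as plain strings makes it easy to show to the user before any browser work starts.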

Fallback (no sub-agent): If sub-agents are unavailable, gather criteria and resolve URL sequentially.

Gather acceptance criteria from (in priority order):

1. Explicit criteria provided in the prompt
2. Current ticket/issue (if referenced)
3. PR description
4. .tap/tap-audit.md for environment context

If no criteria are found, ask the user in human mode; in agent mode, infer them from the diff.

Resolve test URL (in priority order):

1. URL provided in the prompt
2. .tap/tap-audit.md → Environments section
3. package.json scripts → dev, start, or similar
4. Common defaults: http://localhost:3000, http://localhost:5173, http://localhost:4321

Verify the app is running before proceeding.
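
The priority-ordered lookup and the "is it running" check can be combined in one helper. The sketch below is one possible reading, not the skill's own implementation: it treats "first candidate that answers with HTTP 200" as the resolution rule, and the names `http_status`, `resolve_test_url`, and `DEFAULT_URLS` are hypothetical. The probe is injectable so the logic can be exercised without a live server.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional

# Common defaults listed in the fallback section.
DEFAULT_URLS = [
    "http://localhost:3000",
    "http://localhost:5173",
    "http://localhost:4321",
]


def http_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code          # server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return None              # connection refused, timeout, DNS failure


def resolve_test_url(
    candidates: Iterable[Optional[str]],
    probe: Callable[[str], Optional[int]] = http_status,
) -> Optional[str]:
    """Return the first non-empty candidate that responds with HTTP 200."""
    for url in candidates:
        if url and probe(url) == 200:
            return url
    return None
```

A caller would pass the candidates in priority order, for example the URL from the prompt (or `None`), then the audit-file entry, then `DEFAULT_URLS`. A `None` result means the dev server still needs to be started.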

2. Interactive Setup (Main Thread)

Before launching the browser testing sub-agent, use the main thread to handle anything that is hard to automate. Browser state persists across this handoff because the sub-agent connects to the same Chrome instance.

When to prompt for interactive setup:

- needs_login is true → ask the user: "App requires login. Want me to navigate to login page so you can sign in, or should I attempt automated login?"
- Complex drag-and-drop or gesture-based preconditions
- Multi-factor auth, CAPTCHAs, OAuth popups

What to do:

1. Navigate to the relevant page via Chrome MCP
2. Tell the user what action is needed
3. Wait for user confirmation that setup is complete
4. Then launch the browser testing sub-agent

If no interactive setup is needed, skip directly to step 3.

3. Browser Testing (Sub-agent)

Launch a general-purpose sub-agent for all browser interaction. This keeps snapshot/interaction data out of the main thread context.

Sub-agent prompt template:

You are running browser-based QA tests. The browser is already open and may already
be logged in / set up.

Test URL: {test_url}
Acceptance criteria to verify:
{numbered list of criteria}

Additional context:
- DB tools available: {db_available}
- Has async flows: {has_async_flows}
- Test pages: {test_pages}

## How to test

Use Chrome MCP tools (`mcp__chrome-devtools__*`).

**Snapshot-first workflow** — use `take_snapshot` for BOTH finding elements AND
verifying results. Do NOT use `take_screenshot` unless a criterion fails and you
need visual debugging evidence.

**For each criterion:**
1. Navigate to the relevant page
2. `take_snapshot` → get element UIDs and current state
3. Interact via UIDs (`click`, `fill`, `hover`)
4. `take_snapshot` → verify state changed as expected
5. Check `list_console_messages` for errors
6. Check `list_network_requests` for failed requests (4xx, 5xx)

**Important**: UIDs are ephemeral — always take a fresh snapshot before interacting.

**On failure only**: `take_screenshot` and save to `./qa-evidence/` for debugging.

**React/SPA hover interactions:**
Chrome DevTools `hover` only triggers CSS `:hover`, NOT JS `mouseenter`/`mouseover`.
If a UI element only appears via React's `onMouseEnter`:
1. Try `click` directly in the area
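2. If that fails, use `evaluate_script` to dispatch a bubbling `mouseover` event on the element (React derives `onMouseEnter` from native `mouseover`/`mouseout`, so a bare `mouseenter` event may not reach the handler)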
3. `take_snapshot` to confirm

**Testing patterns:**
- Form submission: fill → submit → snapshot to verify success + check no errors
- Navigation: click → snapshot to verify new state + check URL
- State changes: trigger action → snapshot to verify → reload → snapshot to verify persistence
- Async: trigger → snapshot for intermediate state → poll snapshots → verify final state
- Error states: trigger invalid input → snapshot to verify error messaging

**Always check:**
- Console errors (JS exceptions)
- Failed network requests (4xx, 5xx)

## Report format

Return ONLY a compact summary in this exact format:

RESULT: [PASS / FAIL / PARTIAL]

CRITERIA:
1. [criterion] — PASS/FAIL — [one-line observation]
2. [criterion] — PASS/FAIL — [one-line observation]
...

ERRORS: [any console errors or failed network requests, or "none"]

FAILURES: [for any failed criterion: what happened, what was expected,
screenshot path if captured]

NEEDS_MANUAL: [any criteria that couldn't be tested due to automation
limitations — e.g., drag-and-drop, complex gestures]
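
Because the report format is fixed, the main thread can parse it mechanically instead of re-reading it. Here is a minimal sketch, assuming the sub-agent follows the template exactly, including the em dash separator; `parse_report`, `Criterion`, and `failed` are hypothetical helper names.

```python
import re
from dataclasses import dataclass

SEP = "\u2014"  # em dash used as the field separator in the report format

# Matches lines such as: "1. <criterion> <SEP> PASS <SEP> <observation>"
CRITERION_RE = re.compile(
    rf"^\s*(\d+)\.\s+(.*?)\s+{SEP}\s+(PASS|FAIL)\s+{SEP}\s+(.*)$"
)


@dataclass
class Criterion:
    number: int
    text: str
    passed: bool
    observation: str


def parse_report(report: str) -> tuple[str, list[Criterion]]:
    """Extract the overall RESULT and the per-criterion lines from a summary."""
    result = "UNKNOWN"
    criteria: list[Criterion] = []
    for line in report.splitlines():
        if line.startswith("RESULT:"):
            result = line.split(":", 1)[1].strip()
            continue
        match = CRITERION_RE.match(line)
        if match:
            num, text, status, obs = match.groups()
            criteria.append(Criterion(int(num), text, status == "PASS", obs.strip()))
    return result, criteria


def failed(criteria: list[Criterion]) -> list[Criterion]:
    """Return only the criteria that did not pass."""
    return [c for c in criteria if not c.passed]
```

The list returned by `failed` is what a follow-up sub-agent would be asked to re-test.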

4. Report Results

The sub-agent returns a compact summary. Present it to the user.

Human mode: Show the summary. If any failures, ask: "Want me to fix this and re-test, or is this expected?"

Agent mode: If all pass, proceed (e.g., open PR). If any fail, attempt fix-and-retest.

5. Failure Handling

Automation failures (NEEDS_MANUAL):

1. The user performs the manual action in the browser (main thread)
2. Launch a new sub-agent to verify only the remaining criteria
3. The new sub-agent picks up the browser state left by the user

Code failures (FAIL):

Agent mode:

- Fix the code
- Launch a new sub-agent to re-test only the failed criteria
- Stop after at most 2 fix-and-retest cycles

Human mode:

- Present the failures
- Ask: "Want me to fix this and re-test, or is this expected behavior?"
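
The agent-mode rule is a bounded loop. It can be sketched as follows, with the fix and retest steps passed in as callables; `fix_and_retest`, `fix`, and `retest` are hypothetical stand-ins for "edit the code" and "launch a new sub-agent on the failed criteria".

```python
from typing import Callable

MAX_CYCLES = 2  # the cap on fix-and-retest cycles stated for agent mode


def fix_and_retest(
    failed: list[str],
    fix: Callable[[list[str]], None],
    retest: Callable[[list[str]], list[str]],
    max_cycles: int = MAX_CYCLES,
) -> list[str]:
    """Run up to max_cycles fix/retest rounds; return criteria still failing."""
    remaining = list(failed)
    for _ in range(max_cycles):
        if not remaining:
            break                        # everything passes, stop early
        fix(remaining)                   # hypothetical: apply a code fix
        remaining = retest(remaining)    # hypothetical: re-test failed criteria only
    return remaining
```

If the returned list is still non-empty after the last cycle, the remaining failures would be surfaced to the user rather than retried again.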

Optional: Database Verification

Include in the sub-agent prompt when db_available is true. For features that create or modify data:

- Record creation: Verify expected rows exist with correct values
- Relational data: Confirm junction table rows were created
- Status transitions: Confirm async workflows completed
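
Confirming that an async workflow completed usually means polling a status value until it reaches a terminal state. The sketch below shows that pattern in isolation. The `read_status` callable stands in for whatever query the database tools run, and the status names `completed` and `failed` are assumptions, not values taken from the skill.

```python
import time
from typing import Callable, Iterable, Optional


def poll_status(
    read_status: Callable[[], str],
    terminal: Iterable[str] = ("completed", "failed"),   # assumed status names
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll read_status until it returns a terminal value or the timeout passes."""
    terminal_set = set(terminal)
    deadline = clock() + timeout_s
    while True:
        status = read_status()
        if status in terminal_set:
            return status
        if clock() >= deadline:
            return None          # timed out; the criterion should be reported as FAIL
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the helper testable without real waiting.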

Boundaries

Does NOT write unit tests (that's implement-acceptance-tests)
Does NOT review code quality (that's CLAUDE.md / code review)
Does NOT assess blast radius (that's /blast-radius)
Tests user-visible behavior in the browser, with optional database verification
Does NOT modify acceptance criteria — tests what was specified

Maintainer

teambrilliant Core maintainer

Source details

Full Name: teambrilliant/dev-skills
Branch: main
Path in repo: skills/qa-test
Topics: claude-code claude-code-skills developer-workflow qa-automation ai-dev-tools


