Agent skill
visual-regression-testing
Use when implementing UI components, design systems, or responsive layouts - verifies visual correctness through screenshot comparison and DevTools verification; prevents shipping broken UI
Frontend Visual Regression Testing
Overview
Visual regression testing captures screenshots of UI components/pages and compares them against baseline images to detect unintended visual changes.
When to use this skill:
- Implementing new UI components
- Modifying existing UI
- Working on design systems
- Implementing responsive layouts
- Refactoring CSS/styling
The Iron Law
NO UI CHANGES WITHOUT VISUAL VERIFICATION
If you changed UI code (HTML, CSS, JSX, templates):
- You MUST take screenshots
- You MUST verify in DevTools
- You MUST compare against baseline (if exists)
- You CANNOT claim "looks good" without evidence
Visual TDD Cycle
Visual regression testing integrates with TDD through TWO sequential cycles:
Phase 1: Component Functionality (Traditional TDD)
RED Phase:
test('checkout form renders with required fields', async ({ mount }) => {
  const component = await mount('<checkout-form></checkout-form>');
  // Test functionality (TDD RED - this will fail)
  await expect(component.locator('[name="cardNumber"]')).toBeVisible();
  await expect(component.locator('[name="expiry"]')).toBeVisible();
  await expect(component.locator('[name="cvc"]')).toBeVisible();
});
Run test: FAILS (component doesn't exist)
GREEN Phase:
// Implement checkout-form component
// Add cardNumber, expiry, cvc fields
Run test: PASSES (component renders fields)
REFACTOR Phase: Improve component structure, styling, accessibility
Phase 2: Visual Correctness (Visual TDD)
After component functionally works, add visual verification:
test('checkout form visual appearance', async ({ mount, page }) => {
  await mount('<checkout-form></checkout-form>');
  // Visual regression test
  await expect(page.locator('.checkout-form'))
    .toHaveScreenshot('checkout-form.png');
});
First run (Baseline Generation):
- No baseline exists
- Test generates baseline screenshot
- Review baseline: Does it look correct?
- Commit baseline to git
RED Phase (Visual Regression): After baseline exists, make CSS change:
/* Change button color from blue to red */
.submit-button { background: red; }
Run test: FAILS (screenshot doesn't match baseline)
Review diff: Is the change intentional?
GREEN Phase (Update Baseline if Intentional): If red button is intentional:
npm test -- --update-snapshots
Commit the new baseline
Run test: PASSES
If red button is NOT intentional (regression):
/* Revert change */
.submit-button { background: blue; }
Run test: PASSES
Summary: Two TDD Cycles
- Functional TDD (first): Write test for component behavior → Implement → Refactor
- Visual TDD (second): Generate baseline → Make changes → Verify no regressions
Integration:
- Functional tests come FIRST (component must work before visual testing)
- Visual tests come SECOND (component must look right)
- Both follow TDD, but visual baseline generation is a special case
- Baseline generation doesn't violate "watch it fail" - the failure comes when you change CSS and screenshot differs
Cross-reference: See practicing-tdd skill for core RED-GREEN-REFACTOR principles.
Step-by-Step Process
Step 1: Before Implementation
IF baseline exists (modifying existing UI):
- Note current visual state
- Identify what should change
- Identify what should NOT change
IF no baseline (new UI):
- Plan visual appearance
- Prepare to capture initial screenshot
Step 2: During Implementation
Write tests first (TDD):
// Test that component renders
test('checkout form renders correctly', async ({ mount, page }) => {
  // `mount` must be destructured from the fixtures alongside `page`
  await mount('<checkout-form></checkout-form>');
  // Take screenshot of component
  await expect(page.locator('[data-testid="checkout-form"]'))
    .toHaveScreenshot('checkout-form.png');
});
Implement component (GREEN phase)
Step 3: Visual Verification (MANDATORY)
3.1: Take Screenshot
Prefer element-level over full-page:
// ✅ GOOD: Element-level (less noise)
await expect(page.locator('.checkout-form'))
  .toHaveScreenshot('checkout-form.png');

// ❌ BAD: Full page (too much noise)
await expect(page).toHaveScreenshot('entire-page.png');
3.2: DevTools Verification
BEFORE claiming UI works:
1. Open DevTools Console:
   - Press F12 or Cmd+Option+I
   - Click "Console" tab
   - Refresh page
2. Verify NO errors:
   - ✅ GOOD: Console is empty (or only expected logs)
   - ❌ BAD: Red errors visible
   - ❌ BAD: Yellow warnings visible (unless documented)
3. Take Console Screenshot:
   - Screenshot showing clean console
   - Include in completion evidence
4. Check Network Tab:
   - Click "Network" tab
   - Refresh page
   - Verify expected requests made
   - Verify no failed requests (red)
5. Test Responsive Breakpoints:
   - Mobile: 375x667 (iPhone SE)
   - Tablet: 768x1024 (iPad)
   - Desktop: 1920x1080
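The console rule in step 2 (no errors, no undocumented warnings) can also be expressed as a small check that runs next to the manual DevTools pass. The following is a minimal sketch, not part of the skill itself: the `ConsoleEntry` shape, the `findConsoleViolations` name, and the substring-based allowlist are all assumptions made for illustration.

```typescript
// Hypothetical shape for a captured console entry; in Playwright the two
// fields could be filled from ConsoleMessage.type() and ConsoleMessage.text().
type ConsoleEntry = { type: string; text: string };

// Returns the entries that should block completion: every error, plus any
// warning whose text does not contain one of the documented substrings.
function findConsoleViolations(
  entries: ConsoleEntry[],
  documentedWarnings: string[] = [],
): ConsoleEntry[] {
  return entries.filter((entry) => {
    if (entry.type === "error") return true;
    if (entry.type === "warning") {
      return !documentedWarnings.some((known) => entry.text.includes(known));
    }
    return false; // log, info, and debug entries are allowed
  });
}

// Example: one expected log, one documented warning, one real error.
const captured: ConsoleEntry[] = [
  { type: "log", text: "app started" },
  { type: "warning", text: "Deprecated prop `size` on <Button>" },
  { type: "error", text: "TypeError: cart is undefined" },
];
const violations = findConsoleViolations(captured, ["Deprecated prop"]);
console.log(violations.length); // 1
```

In a Playwright test the entries could be gathered before navigation with `page.on('console', (msg) => captured.push({ type: msg.type(), text: msg.text() }))`. Uncaught exceptions and failed requests arrive through the separate `pageerror` and `requestfailed` page events, so a fuller check would listen to those as well.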
3.3: Compare Against Baseline
IF baseline exists:
// Test runs, Playwright compares screenshots
// IF different: Test fails with diff image
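Review diff image (colors as drawn by Playwright's default comparator):
- Red pixels: Pixels that differ between baseline and actual
- Yellow pixels: Differences attributed to anti-aliasing
- Unmarked pixels: Unchanged content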
Decision tree:
Are differences intentional?
├─ YES → Update baseline, document why
└─ NO → Fix regression, re-run test
IF no baseline:
- First run generates baseline
- Visually review screenshot
- Verify it looks correct
- Commit baseline image to git
Step 4: Baseline Management
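Baseline files (illustrative layout; by default Playwright keeps baselines in a `<test-file>-snapshots/` folder next to the test and writes the actual and diff images to its test output directory):
tests/
  screenshots/
    checkout-form.png          ← Baseline
    checkout-form-diff.png     ← Diff (if different)
    checkout-form-actual.png   ← Actual (if different)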
When to update baseline:
- ✅ Intentional UI changes
- ✅ Design system updates
- ✅ After reviewing and approving diff
- ❌ NEVER: To make test pass without reviewing
- ❌ NEVER: Because "it looks fine to me"
Updating baseline:
# Review diff first!
# If intentional, update baseline:
npm test -- --update-snapshots
# Or Playwright specific:
npx playwright test --update-snapshots
Framework-Agnostic Patterns
Playwright (Recommended)
test('visual regression', async ({ page }) => {
  await page.goto('/checkout');
  // Element-level screenshot
  await expect(page.locator('.checkout-form'))
    .toHaveScreenshot('checkout-form.png', {
      maxDiffPixels: 100, // Allow minor differences
    });
});
Puppeteer
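import puppeteer from 'puppeteer';

test('visual regression', async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('http://localhost:3000/checkout');
    const element = await page.$('.checkout-form');
    if (!element) throw new Error('.checkout-form not found');
    await element.screenshot({ path: 'checkout-form.png' });
    // Puppeteer only captures the image; compare it against the baseline
    // manually or with a service such as Percy or Chromatic
  } finally {
    await browser.close(); // Always release the browser, even on failure
  }
});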
Cloud Solutions (Optional)
- Chromatic: Cloud visual testing with Storybook
- Percy: Cross-browser screenshot comparison
- LambdaTest SmartUI: AI-powered visual testing
When to Use Visual Testing
YES (visual tests appropriate):
- Layout changes detection
- CSS regression prevention
- Cross-browser rendering verification
- Design system component verification
- Responsive design validation
NO (use other test types):
- Dynamic content (timestamps, random data)
- Third-party widgets (ads, analytics)
- Content that changes frequently
- Animations mid-transition (unless testing specific frame)
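When a component screenshot unavoidably contains one of these unstable regions, one common workaround is to paint over the region before comparing. The sketch below only illustrates the idea on a raw RGBA buffer; the `RgbaImage` and `Rect` types, the `maskRegion` helper, and the magenta fill are assumptions for illustration, not an API from any framework.

```typescript
// Minimal RGBA image: `data` holds width * height * 4 bytes, row by row.
type RgbaImage = { width: number; height: number; data: Uint8Array };
type Rect = { x: number; y: number; width: number; height: number };

// Returns a copy of `image` with every pixel inside `rect` painted a solid
// color, so whatever was rendered there cannot affect a later comparison.
function maskRegion(
  image: RgbaImage,
  rect: Rect,
  color: [number, number, number, number] = [255, 0, 255, 255],
): RgbaImage {
  const data = new Uint8Array(image.data); // copy; the input stays untouched
  const xEnd = Math.min(rect.x + rect.width, image.width);
  const yEnd = Math.min(rect.y + rect.height, image.height);
  for (let y = Math.max(rect.y, 0); y < yEnd; y++) {
    for (let x = Math.max(rect.x, 0); x < xEnd; x++) {
      data.set(color, (y * image.width + x) * 4);
    }
  }
  return { width: image.width, height: image.height, data };
}

// Two 2x1 captures that differ only in the right-hand pixel (a "timestamp").
const runA: RgbaImage = {
  width: 2, height: 1, data: new Uint8Array([10, 10, 10, 255, 1, 2, 3, 255]),
};
const runB: RgbaImage = {
  width: 2, height: 1, data: new Uint8Array([10, 10, 10, 255, 9, 8, 7, 255]),
};
const timestampArea: Rect = { x: 1, y: 0, width: 1, height: 1 };

const maskedA = maskRegion(runA, timestampArea);
const maskedB = maskRegion(runB, timestampArea);
const identical = maskedA.data.every((byte, i) => byte === maskedB.data[i]);
console.log(identical); // true
```

Playwright offers this natively: `toHaveScreenshot` accepts a `mask` option that takes a list of locators to cover before the comparison, so a hand-rolled helper like this is rarely needed there.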
Configuration Best Practices
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      maxDiffPixels: 100,     // Allow up to 100 differing pixels
      threshold: 0.2,         // Per-pixel color tolerance (0 = strict, 1 = lax); not a percentage of pixels
      animations: 'disabled', // Disable animations for stability
    },
  },
});
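The two numeric options work at different levels: `threshold` decides whether a single pixel counts as different, and `maxDiffPixels` decides how many such pixels a screenshot may contain before the test fails. The sketch below illustrates that interaction on raw RGBA buffers. It is a simplification under a stated assumption: the per-pixel distance here is the largest channel delta scaled to 0..1, whereas Playwright's comparator uses a perceptual color-difference metric, so real counts will differ. The function names are hypothetical.

```typescript
// Counts pixels whose color differs by more than `threshold` between two
// same-sized RGBA buffers. ASSUMPTION: distance = largest channel delta / 255,
// a stand-in for the perceptual metric a real comparator would use.
function countDiffPixels(
  baseline: Uint8Array,
  actual: Uint8Array,
  threshold: number,
): number {
  if (baseline.length !== actual.length) {
    throw new Error("images must have the same dimensions");
  }
  let differing = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    let maxDelta = 0;
    for (let c = 0; c < 4; c++) {
      maxDelta = Math.max(maxDelta, Math.abs(baseline[i + c] - actual[i + c]));
    }
    if (maxDelta / 255 > threshold) differing++;
  }
  return differing;
}

// A screenshot passes when the differing-pixel count stays within the budget.
function screenshotMatches(
  baseline: Uint8Array,
  actual: Uint8Array,
  options: { threshold: number; maxDiffPixels: number },
): boolean {
  return countDiffPixels(baseline, actual, options.threshold) <= options.maxDiffPixels;
}

// Three pixels: identical, slightly off (rendering noise), clearly changed.
const baselinePixels = new Uint8Array([0, 0, 255, 255, 200, 200, 200, 255, 0, 0, 255, 255]);
const actualPixels = new Uint8Array([0, 0, 255, 255, 210, 200, 200, 255, 255, 0, 0, 255]);

console.log(countDiffPixels(baselinePixels, actualPixels, 0.2)); // 1
console.log(screenshotMatches(baselinePixels, actualPixels, { threshold: 0.2, maxDiffPixels: 0 })); // false
console.log(screenshotMatches(baselinePixels, actualPixels, { threshold: 0.2, maxDiffPixels: 100 })); // true
```

The last line shows why a generous `maxDiffPixels` deserves care on small components: a budget of 100 pixels can hide a real change in a 16x16 icon while being negligible on a full form.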
Mandatory Verification Checklist
BEFORE claiming UI work complete:
Visual Verification
- Screenshot taken for all changed UI elements
- Screenshot compared against baseline (if exists)
- Differences reviewed and determined intentional/regression
- Baseline updated if changes intentional
DevTools Verification
- DevTools Console opened
- Console shows NO errors (0 red messages)
- Console shows NO warnings (or warnings documented)
- Console screenshot taken and included in evidence
Network Verification
- DevTools Network tab opened
- Expected API calls made
- No failed requests (no red in network tab)
- Response data correct
Responsive Verification
- Tested mobile breakpoint (375x667)
- Tested tablet breakpoint (768x1024)
- Tested desktop breakpoint (1920x1080)
If ANY checkbox unchecked: UI work is NOT complete.
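To keep the three breakpoints from the checklist consistent across tests, they can live in one table with a helper that derives a baseline file name per breakpoint. This is a minimal sketch; the `Breakpoint` type, the `BREAKPOINTS` constant, and the naming scheme in `snapshotName` are assumptions for illustration, not conventions defined by this skill.

```typescript
// The three breakpoints named in the checklist.
type Breakpoint = { name: string; width: number; height: number };

const BREAKPOINTS: Breakpoint[] = [
  { name: "mobile", width: 375, height: 667 },   // iPhone SE
  { name: "tablet", width: 768, height: 1024 },  // iPad
  { name: "desktop", width: 1920, height: 1080 },
];

// Hypothetical naming scheme: one baseline file per component per breakpoint.
function snapshotName(component: string, bp: Breakpoint): string {
  return `${component}-${bp.name}-${bp.width}x${bp.height}.png`;
}

const names = BREAKPOINTS.map((bp) => snapshotName("checkout-form", bp));
console.log(names.join("\n"));
// checkout-form-mobile-375x667.png
// checkout-form-tablet-768x1024.png
// checkout-form-desktop-1920x1080.png
```

Inside a Playwright test, each entry could be applied with `await page.setViewportSize({ width: bp.width, height: bp.height })` before calling `toHaveScreenshot(snapshotName('checkout-form', bp))`, which yields one baseline per breakpoint.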
Evidence Requirements
When claiming UI work complete, provide:
1. Screenshot evidence:
   Screenshot: checkout-form.png (baseline)
   [Attach screenshot]
   Changes: Intentional (updated button styling)
   Baseline updated: YES
2. DevTools Console evidence:
   Console verification: [Screenshot showing empty console]
   Errors: 0
   Warnings: 0
3. Network evidence (if API calls):
   Network verification: [Screenshot showing successful requests]
   Expected requests: ✓ GET /api/products
   Failed requests: 0
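If the evidence is collected programmatically, the three templates above can be filled from a single record so that no field is forgotten. The following is a sketch only; the `Evidence` type and the `formatEvidence` function are hypothetical names, and the screenshots themselves still have to be attached separately.

```typescript
// Hypothetical record of the evidence gathered for one UI change.
type Evidence = {
  screenshot: string;
  changes: string;
  baselineUpdated: boolean;
  consoleErrors: number;
  consoleWarnings: number;
  expectedRequests: string[]; // leave empty when the UI makes no API calls
  failedRequests: number;
};

// Renders the evidence as plain text, one field per line.
function formatEvidence(e: Evidence): string {
  const lines = [
    `Screenshot: ${e.screenshot} (baseline)`,
    `Changes: ${e.changes}`,
    `Baseline updated: ${e.baselineUpdated ? "YES" : "NO"}`,
    `Console errors: ${e.consoleErrors}`,
    `Console warnings: ${e.consoleWarnings}`,
  ];
  if (e.expectedRequests.length > 0) {
    lines.push(`Expected requests: ${e.expectedRequests.join(", ")}`);
    lines.push(`Failed requests: ${e.failedRequests}`);
  }
  return lines.join("\n");
}

const report = formatEvidence({
  screenshot: "checkout-form.png",
  changes: "Intentional (updated button styling)",
  baselineUpdated: true,
  consoleErrors: 0,
  consoleWarnings: 0,
  expectedRequests: ["GET /api/products"],
  failedRequests: 0,
});
console.log(report);
```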
Red Flags - STOP IMMEDIATELY
If you catch yourself:
- Claiming "looks good" without screenshots
- Skipping DevTools verification
- Updating baseline without reviewing diff
- Taking full-page screenshots for component changes
- Proceeding with console errors visible
- Not testing responsive breakpoints
THEN:
- STOP immediately
- Complete all verification steps
- This is not optional
Integration with Other Skills
Combines with:
- practicing-tdd: Visual tests follow TDD cycle
- verifying-before-completion: Visual verification required
- frontend-accessibility-verification: Check a11y after visual verification
Common Rationalizations
| Rationalization | Counter |
|---|---|
| "I can see it looks good" | Your eyes aren't regression tests. Take screenshot. |
| "It's a small change" | Small changes cause visual regressions. Screenshot required. |
| "I'll check it in the browser" | Browser check ≠ automated verification. Take screenshot. |
| "Console errors don't affect appearance" | Errors indicate bugs. Fix before claiming complete. |
| "Full page screenshot is easier" | Element screenshots catch actual changes. Be specific. |
Example Session
Agent: "I'm implementing a checkout form component."
[Uses visual-regression-testing skill]
1. Write test expecting checkout form renders
2. Take screenshot of component → baseline
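3. Run test → baseline generated (Playwright reports this first run as a failure because no snapshot existed yet); review and commit the baseline, then re-run → PASS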
4. Refactor CSS for better spacing
5. Run test → FAIL (screenshot different)
6. Review diff → Intentional (better spacing)
7. Update baseline
8. Open DevTools console → 0 errors
9. Take console screenshot
10. Test responsive breakpoints → All look correct
11. Provide evidence in completion message:
- checkout-form.png (baseline)
- Console screenshot (0 errors)
- Responsive screenshots (mobile/tablet/desktop)
"Checkout form complete. Visual regression test passing.
Console clean. Responsive breakpoints verified."
References
- Playwright screenshot comparison: https://playwright.dev/docs/test-snapshots
- Testing Library philosophy: Test user-visible behavior
- Modern frontend testing (2024-2025 practices)
Remember: NO UI CHANGES WITHOUT VISUAL VERIFICATION. Screenshots are evidence, not optional.