Agent skill
testing-workflow
Write comprehensive tests following project conventions (tiers, patterns, anti-patterns). Use when writing tests, improving test coverage, fixing failing tests, or reviewing test quality.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/testing-workflow
Testing Workflow Skill
Quick Decision: Which Test Tier?
Ask yourself:
- Fast local iteration? → Tier 0 (`pytest --tier=0`)
- Before commit? → Tier 1 (`pytest`, default)
- Integration validation? → Tier 2 (`pytest --tier=2`)
- Pre-deployment? → Tier 3 (`pytest --tier=3`)
- Release validation? → Tier 4 (`pytest --tier=4`)
Quick Decision: Which Test Strategy?
For Lambda/infrastructure testing (layers beyond pytest):
- Quick dev iteration? → Unit tests only (`just test-scheduler-unit`, 15s)
- Before commit? → Quick validation layers 1-5 (`just test-scheduler`, 2 min)
- Lambda changes? → Docker tests (`just test-scheduler-docker`, 90s)
- Step Functions changes? → Contract tests (`just test-scheduler-contracts`, 10s)
- Pre-deployment? → Full validation (`just test-scheduler-all`, 5 min)
- AWS integration? → Integration tests (`just test-scheduler-integration`, 60s)
See Progressive Testing Strategy for the 7-layer approach.
Docker-Based Testing for Lambda Functions
NEW: Docker-based testing prevents "filesystem unaware" deployment failures
For Lambda functions (LINE bot, Telegram API), run tests in Docker to match production runtime:
# LINE bot Docker import validation
./scripts/test_line_bot_docker.sh
# Pre-commit validation (syntax + unit tests + Docker imports)
./scripts/test_line_bot_pre_commit.sh
Why Docker tests matter:
- ✅ Runtime fidelity: Tests run in exact Lambda Python 3.11 environment
- ✅ Filesystem aware: Validates deployment package structure (`/var/task`)
- ✅ Catches import errors: "cannot import handle_webhook" caught before production
- ✅ 2 birds 1 stone: Tests logic AND validates deployment environment
CI/CD integration:
- GitHub Actions runs Docker import tests automatically (`.github/workflows/deploy-line-dev.yml`)
- Tests block deployment if imports fail
- Prevents false positive deployments (tests pass but Lambda fails)
Anti-pattern prevented:
- ❌ Running tests in the dev environment (setup-python) but deploying to Lambda (Docker container)
- ✅ Running tests in a Docker container that matches the deployed environment
See: .claude/specifications/workflow/2025-12-29-implement-test-workflow-to-reduce-false-positive-deployment.md
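The core of an import check like the one described above can be sketched in plain Python. This is a minimal illustration, not the contents of `test_line_bot_docker.sh`: the helper name `can_import` and the module name `handler` are hypothetical. It starts a fresh interpreter whose working directory is the deployment package, so only files that were actually packaged are importable.

```python
import subprocess
import sys


def can_import(package_dir: str, module: str, attr: str) -> bool:
    """Try `from module import attr` in a fresh interpreter rooted at package_dir.

    With `python -c`, the current working directory is placed first on
    sys.path, so the import resolves against the packaged files.
    """
    result = subprocess.run(
        [sys.executable, "-c", f"from {module} import {attr}"],
        cwd=package_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface the ImportError text so the CI log shows what is missing.
        print(result.stderr.strip(), file=sys.stderr)
    return result.returncode == 0
```

Inside the Lambda container this would presumably be called as `can_import("/var/task", "handler", "handle_webhook")`, where `handler` stands in for whatever module the function is actually deployed under.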
Loop Pattern: Synchronize Loop (Test-Code Alignment)
Escalation Trigger:
- Tests pass but code still buggy (drift between test intent and reality)
- `/validate` shows tests don't actually test the claim
- Knowledge drift: Test assumptions outdated
Tools Used:
- `/validate` - Verify tests actually test what they claim (sabotage code, test should fail)
- `/consolidate` - Align test intent with code reality (update tests or fix code)
- `/trace` - Understand test failure causality (why did this test fail?)
- `/reflect` - Assess test quality (are we testing outcomes or just execution?)
Why This Works: Testing naturally involves a synchronize loop: ensuring that tests align with code behavior, not just that they pass.
See Thinking Process Architecture - Feedback Loops for structural overview.
Test Structure
tests/
├── conftest.py # Shared fixtures ONLY
├── shared/ # Agent, workflow, data tests
├── telegram/ # Telegram API tests
├── line_bot/ # LINE Bot tests (mark: legacy)
├── e2e/ # Playwright browser tests
├── integration/ # External API tests
└── infrastructure/ # S3, DynamoDB tests
When to Use Each Tier
| Tier | Command | Includes | Use Case |
|---|---|---|---|
| 0 | `pytest --tier=0` | Unit only | Fast local |
| 1 | `pytest` (default) | Unit + mocked | Deploy gate |
| 2 | `pytest --tier=2` | + integration | Nightly |
| 3 | `pytest --tier=3` | + smoke | Pre-deploy |
| 4 | `pytest --tier=4` | + e2e | Release |
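`--tier` is a project-specific option, not a built-in pytest flag. One way such an option could be wired up in `conftest.py` is sketched below. The marker-to-tier mapping and the helper `required_tier` are assumptions inferred from the table, and the sketch does not model the Tier 0 versus Tier 1 distinction (unit only versus unit + mocked), since the document names no marker for it. The project's real implementation may differ.

```python
# conftest.py (sketch, not the project's actual implementation)

# Assumed mapping: the lowest tier at which tests with this marker run.
MARKER_MIN_TIER = {"integration": 2, "smoke": 3, "e2e": 4}


def required_tier(marker_names):
    """Return the lowest tier that includes a test carrying these markers."""
    return max((MARKER_MIN_TIER.get(name, 0) for name in marker_names), default=0)


def pytest_addoption(parser):
    # Default of 1 matches "pytest (default)" in the table above.
    parser.addoption(
        "--tier",
        type=int,
        default=1,
        choices=[0, 1, 2, 3, 4],
        help="Highest test tier to run (0-4)",
    )


def pytest_collection_modifyitems(config, items):
    tier = config.getoption("--tier")
    selected, deselected = [], []
    for item in items:
        needed = required_tier(marker.name for marker in item.iter_markers())
        (selected if needed <= tier else deselected).append(item)
    if deselected:
        # Report the excluded tests as deselected and drop them in place.
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

Deselecting rather than skipping keeps lower-tier runs free of skip noise. Marking excluded tests as skipped would be an equally reasonable choice.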
Writing a Test: Checklist
- Choose test location based on component under test
- Use class-based structure: `class TestComponent:`
- Follow canonical pattern: See PATTERNS.md
- Avoid anti-patterns: Check ANTI-PATTERNS.md
- Apply defensive validation: See DEFENSIVE.md
- Verify test can fail: Sabotage code, test should fail
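The last checklist item can be made concrete with a small sketch. Everything here is hypothetical (`apply_discount`, the deliberately broken `sabotaged_discount`, and the `detects_sabotage` helper). The point is only that a check which pins the outcome fails when the code is sabotaged, while a weak check keeps passing.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    return price * (1 - percent / 100)


def sabotaged_discount(price: float, percent: float) -> float:
    """Deliberately broken variant: ignores the discount entirely."""
    return price


def weak_check(fn) -> None:
    # Only proves the call returned something.
    assert fn(200.0, 25.0) is not None


def strong_check(fn) -> None:
    # Pins the actual outcome.
    assert fn(200.0, 25.0) == 150.0


def detects_sabotage(check, broken_fn) -> bool:
    """Return True if the check fails when run against the broken code."""
    try:
        check(broken_fn)
    except AssertionError:
        return True
    return False
```

Here `detects_sabotage(strong_check, sabotaged_discount)` is true and `detects_sabotage(weak_check, sabotaged_discount)` is false, which is the signal that the weak check needs rewriting.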
Common Workflows
Writing a Unit Test
- Create `class TestComponent` in appropriate test file
- Add `setup_method()` if component needs initialization
- Write test method: `def test_behavior_description(self):`
- Use fixtures from conftest.py for shared data
- Assert outcomes, not just execution
- Sabotage code to verify test catches failures
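The steps above might look like the following for a small component. `SlugBuilder` is a hypothetical stand-in, not part of this project. The failure-mode test uses `try`/`except`/`else` only to keep the sketch free of imports; in a real test file `pytest.raises` is the usual idiom.

```python
class SlugBuilder:
    """Hypothetical component under test."""

    def __init__(self, separator: str = "-") -> None:
        self.separator = separator

    def build(self, title: str) -> str:
        words = title.lower().split()
        if not words:
            raise ValueError("title must contain at least one word")
        return self.separator.join(words)


class TestSlugBuilder:
    def setup_method(self):
        # Runs before every test method, so no state leaks between tests.
        self.builder = SlugBuilder()

    def test_build_joins_lowercased_words(self):
        result = self.builder.build("Hello Big World")
        # Assert the outcome, not just that the call succeeded.
        assert isinstance(result, str)
        assert result == "hello-big-world"

    def test_build_rejects_blank_title(self):
        try:
            self.builder.build("   ")
        except ValueError as exc:
            assert "at least one word" in str(exc)
        else:
            raise AssertionError("expected ValueError for a blank title")
```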
Adding Integration Tests
- Mark with `@pytest.mark.integration`
- Use real external APIs (LLM, yfinance, Aurora)
- Validate multi-layer outcomes (status code → logs → data state)
- Consider rate limits (`@pytest.mark.ratelimited`)
Improving Test Coverage
- Run `pytest --cov` to see coverage report
- Identify untested branches and edge cases
- Write tests for failure modes (not just success)
- Add boundary condition tests
Fixing Failing Tests
- Read test failure message carefully
- Check if code behavior changed (update test)
- Check if test has anti-pattern (fix test)
- Verify test isolation (no shared state between tests)
Test Markers
@pytest.mark.integration # External APIs (LLM, yfinance)
@pytest.mark.smoke # Requires live server
@pytest.mark.e2e # Requires browser
@pytest.mark.legacy # LINE bot (skip in Telegram CI)
@pytest.mark.ratelimited # API rate limited (--run-ratelimited to include)
pytestmark = pytest.mark.legacy # Mark entire file
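Custom markers like these are normally registered so that pytest does not warn about unknown marks and so that `--strict-markers` can catch typos. A minimal sketch of registration in `conftest.py` follows. The descriptions are taken from the comments above, but whether this project registers markers here or in an ini file is an assumption.

```python
# conftest.py (sketch; the project may register markers in pytest.ini instead)

MARKERS = {
    "integration": "calls external APIs (LLM, yfinance)",
    "smoke": "requires a live server",
    "e2e": "requires a browser",
    "legacy": "LINE bot tests (skipped in Telegram CI)",
    "ratelimited": "rate-limited API calls (--run-ratelimited to include)",
}


def pytest_configure(config):
    """Register each marker so `pytest --strict-markers` accepts it."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```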
Quick Reference Commands
# Deploy gate (Tier 1)
just test-deploy
# Integration + Telegram only (Tier 2)
pytest --tier=2 tests/telegram
# Skip LINE bot and browser tests
pytest -m "not legacy and not e2e"
# Include rate-limited tests
pytest --run-ratelimited
# Coverage report
pytest --cov
Rules (DO / DON'T)
| DO | DON'T |
|---|---|
| `class TestComponent:` | `def test_foo()` at module level |
| `assert x == expected` | `return True/False` (pytest ignores!) |
| `assert isinstance(r, dict)` | `assert r is not None` (weak) |
| Define mocks in conftest.py | Duplicate mocks per file |
| Patch where USED: `@patch('src.api.module.lib')` | Patch where defined: `@patch('lib')` |
| `AsyncMock` for async methods | `Mock` for async (breaks await) |
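The last row is easy to demonstrate with the standard library alone. In this sketch `fetch_price` and the client's `get_quote` method are hypothetical. With `AsyncMock` the call can be awaited and asserted on; with a plain `Mock` the call returns a dict, and awaiting a dict raises `TypeError`.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


async def fetch_price(client, ticker: str) -> float:
    """Hypothetical coroutine under test: awaits an async client method."""
    quote = await client.get_quote(ticker)
    return quote["price"]


def run_with_async_mock() -> float:
    client = Mock()
    client.get_quote = AsyncMock(return_value={"price": 101.5})
    price = asyncio.run(fetch_price(client, "ABC"))
    # AsyncMock records awaits, so the interaction can be verified too.
    client.get_quote.assert_awaited_once_with("ABC")
    return price


def run_with_plain_mock() -> str:
    client = Mock()
    # A plain Mock returns the dict directly instead of an awaitable.
    client.get_quote = Mock(return_value={"price": 101.5})
    try:
        asyncio.run(fetch_price(client, "ABC"))
    except TypeError as exc:
        return type(exc).__name__
    return "no error"
```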
Next Steps
- For test patterns: See PATTERNS.md
- For anti-patterns to avoid: See ANTI-PATTERNS.md
- For defensive programming: See DEFENSIVE.md
- For progressive testing (7-layer strategy): See PROGRESSIVE-TESTING.md
- For Lambda/Docker/contract testing: See LAMBDA-TESTING.md