Agent skill
html-structure-validate
Validate HTML5 structure and basic syntax. BLOCKING quality gate - stops pipeline if validation fails. Ensures deterministic output quality.
Install this agent skill to your Project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/abejitsu/html-structure-validate
SKILL.md
HTML Structure Validate Skill
Purpose
This skill is a BLOCKING quality gate that ensures generated HTML meets minimum structural requirements. It is the first deterministic validation of probabilistic AI-generated output.
The skill checks:
- HTML5 compliance - Proper DOCTYPE, tags
- Tag closure - All tags properly closed
- Required elements - Meta tags, stylesheet links
- Well-formedness - Valid structure
If validation fails, the pipeline STOPS and triggers a hook to notify the user.
This enforces the principle: Python validates, ensuring deterministic quality.
What to Do
- Load HTML file to validate
  - Read 04_page_XX.html generated by AI skill
  - Verify file exists and is readable
  - Confirm file is text (not binary)
- Run validation checks
  - Check HTML5 structure compliance
  - Verify tag closure
  - Validate head section
  - Check required CSS link
  - Validate page container structure
- Generate validation report
  - Document all checks performed
  - List any errors found
  - Note warnings (non-blocking)
  - Record informational findings
- Save validation report as JSON
  - Save to: output/chapter_XX/page_artifacts/page_YY/06_validation_structure.json
  - Include timestamp
  - Include all check results
- Exit with appropriate code
  - Return 0 if VALID (continue pipeline)
  - Return 1 if INVALID (STOP pipeline, trigger hook)
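The first step (load the file, confirm it exists and is text) could be sketched as follows. This is a minimal illustration, not code taken from validate_html.py: the function name `load_html` and the NUL-byte heuristic for detecting binary files are assumptions.

```python
from pathlib import Path


def load_html(html_file: str) -> str:
    """Read an HTML file and return its text (hypothetical helper)."""
    path = Path(html_file)
    if not path.is_file():
        raise FileNotFoundError(f"HTML file not found: {html_file}")
    raw = path.read_bytes()
    # Assumption: a NUL byte is treated as a sign of binary content.
    if b"\x00" in raw:
        raise ValueError(f"{html_file} looks binary (contains NUL bytes)")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(f"{html_file} is not valid UTF-8 text") from exc
```

A caller would typically map either exception to exit code 1 so that the pipeline stops.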
Input Parameters
html_file: <str> - Path to 04_page_XX.html
output_dir: <str> - Directory for validation report
strict_mode: <bool> - If true, warnings also fail (default: false)
page_number: <int> - Page number (for reporting)
chapter: <int> - Chapter number (for reporting)
Validation Checks
Check 1: DOCTYPE Declaration
Requirement: File must start with proper DOCTYPE
<!DOCTYPE html>
Check:
- File contains <!DOCTYPE html> (case-insensitive)
- DOCTYPE appears before any tags
- DOCTYPE is on first line or near beginning
Error if: Missing or incorrect DOCTYPE
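One possible way to express the DOCTYPE check in Python is shown below. It is a sketch under stated assumptions: `check_doctype` and both regular expressions are illustrative names, and the actual logic in validate_html.py may differ.

```python
import re

# Case-insensitive match for the HTML5 DOCTYPE.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\s*>", re.IGNORECASE)
# Start of any element tag; "<!" (DOCTYPE, comments) does not match.
TAG_RE = re.compile(r"<[a-zA-Z]")


def check_doctype(html: str) -> tuple[bool, str]:
    """Return (ok, details) for the DOCTYPE check (hypothetical helper)."""
    match = DOCTYPE_RE.search(html)
    if match is None:
        return False, "Missing or incorrect DOCTYPE"
    first_tag = TAG_RE.search(html)
    if first_tag is not None and first_tag.start() < match.start():
        return False, "DOCTYPE appears after the first tag"
    return True, "Valid HTML5 DOCTYPE found"
```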
Check 2: HTML Tags
Requirement: Proper <html> opening and closing tags
<html lang="en">
...
</html>
Checks:
- <html> tag present
- </html> closing tag present
- Tags are properly paired
- No unclosed <html> tags
Error if: Missing either tag or improperly paired
Check 3: Head Section
Requirement: Complete <head> section with metadata
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>...</title>
  <link rel="stylesheet" href="../../styles/main.css">
</head>
Checks:
- <head> and </head> tags present
- <meta charset="UTF-8"> present
- <meta name="viewport"> present (warning if missing)
- <title> tag with content present
- CSS <link> tag present with href attribute
Error if: Missing charset, title, or CSS link
Warning if: Missing viewport meta tag
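The head-section rules above map naturally onto Python's standard `html.parser` module. The sketch below is illustrative only: the class `HeadCollector`, the function `check_head`, and the message strings are assumptions, not the real validate_html.py interface.

```python
from html.parser import HTMLParser


class HeadCollector(HTMLParser):
    """Collects the facts needed for the head-section check (illustrative)."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.has_charset = False
        self.has_viewport = False
        self.has_stylesheet = False
        self.title_text = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif not self.in_head:
            return
        elif tag == "meta":
            if (attrs.get("charset") or "").lower() == "utf-8":
                self.has_charset = True
            if (attrs.get("name") or "").lower() == "viewport":
                self.has_viewport = True
        elif tag == "title":
            self.in_title = True
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.has_stylesheet = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_text += data


def check_head(html: str) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the head section."""
    collector = HeadCollector()
    collector.feed(html)
    collector.close()
    errors, warnings = [], []
    if not collector.has_charset:
        errors.append('Missing <meta charset="UTF-8">')
    if not collector.title_text.strip():
        errors.append("Missing or empty <title>")
    if not collector.has_stylesheet:
        errors.append("Missing CSS <link> with href")
    if not collector.has_viewport:
        warnings.append("Missing viewport meta tag")
    return errors, warnings
```

Keeping errors and warnings in separate lists mirrors the split between blocking and non-blocking findings.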
Check 4: Body Section
Requirement: Proper <body> tags with content
<body>
  <div class="page-container">
    <main class="page-content">
      ...
    </main>
  </div>
</body>
Checks:
- <body> and </body> tags present
- <div class="page-container"> present
- <main class="page-content"> present inside container
- Body contains substantial content (> 100 bytes)
Error if: Missing tags or required container divs
Check 5: Tag Closure Validation
Requirement: All tags must be properly closed
Checks for:
- Unmatched opening tags (e.g., <p> without </p>)
- Improper nesting (e.g., <p><h2>text</h2></p>)
- Self-closing tags used correctly (e.g., <br/>, <img/>)
- Comment blocks properly formatted (<!-- -->)
Validation method:
- Parse HTML into tree structure
- Verify all nodes properly matched
- Check nesting doesn't violate HTML5 rules
Error if: Any unmatched or improperly nested tags
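The matching part of this check can be sketched with a stack of open tags on top of `html.parser`. The names `TagBalanceChecker` and `find_unbalanced_tags` are hypothetical. The sketch is deliberately strict: it reports a missing `</p>` as an error, as the check describes, even though HTML5 allows some end tags to be omitted. It does not cover content-model rules such as a heading inside a paragraph.

```python
from html.parser import HTMLParser

# HTML5 void elements never take a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatches (illustrative)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, line) for each tag still open
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append((tag, self.getpos()[0]))

    def handle_startendtag(self, tag, attrs):
        # Self-closed syntax such as <br/> opens and closes in one step.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        line = self.getpos()[0]
        if tag not in [name for name, _ in self.stack]:
            self.errors.append(f"line {line}: unexpected </{tag}>")
            return
        # Pop until the matching opener; anything skipped was never closed.
        while self.stack:
            open_tag, open_line = self.stack.pop()
            if open_tag == tag:
                break
            self.errors.append(f"line {open_line}: <{open_tag}> is never closed")


def find_unbalanced_tags(html: str) -> list[str]:
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    errors = list(checker.errors)
    errors.extend(f"line {line}: <{tag}> is never closed" for tag, line in checker.stack)
    return errors
```

An empty result means every non-void opening tag found a matching closing tag.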
Check 6: Heading Tags (h1-h6)
Requirement: Valid heading hierarchy
<h1>Chapter Title</h1>
<h2>Section Heading</h2>
<h3>Subsection</h3>
Checks:
- All heading tags properly closed
- First heading should be h1 (warning if not)
- Heading levels don't skip dramatically (h1 → h4 is suspicious)
- All headings have text content (not empty)
Error if: Heading tags improperly closed
Warning if: Suspicious hierarchy
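A heading-hierarchy check along these lines might look like the sketch below. The function name `check_headings` and the `max_jump` threshold are assumptions, because the text does not define how large a jump counts as suspicious. Empty headings are reported as warnings here, which is also an assumption.

```python
import re

# Matches a properly closed heading; \1 requires the same level on both tags.
HEADING_RE = re.compile(
    r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>", re.IGNORECASE | re.DOTALL
)
TAG_STRIP_RE = re.compile(r"<[^>]+>")


def check_headings(html: str, max_jump: int = 1) -> list[str]:
    """Return warnings about heading order and empty headings (hypothetical)."""
    warnings = []
    previous = None
    for match in HEADING_RE.finditer(html):
        level = int(match.group(1))
        text = TAG_STRIP_RE.sub("", match.group(2)).strip()
        if not text:
            warnings.append(f"Empty <h{level}> heading")
        if previous is None and level != 1:
            warnings.append(f"First heading is h{level}, expected h1")
        if previous is not None and level - previous > max_jump:
            warnings.append(f"Heading jump h{previous} -> h{level}")
        previous = level
    return warnings
```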
Check 7: Content Structure
Requirement: Meaningful content in page container
Checks:
- <main class="page-content"> contains elements
- Content includes headings or paragraphs
- No completely empty content area
- Text nodes or elements present (> 100 words total)
Error if: No content or empty structure
Check 8: List Integrity
Requirement: All lists properly structured
Checks for each <ul> or <ol>:
- List opening and closing tags matched
- List contains <li> elements
- All <li> tags properly closed
- <li> count matches opening/closing pairs
- No nested <ul> or <ol> improperly closed
Error if: Empty lists or unmatched <li> tags
Check 9: Image and Link Tags
Requirement: Self-closing tags properly formatted
Checks:
- All <img> tags have src and alt attributes
- All <a> tags have valid href attributes
- Image paths don't have obvious errors (no broken syntax)
- Self-closing tags use proper syntax
Warning if: Images missing alt text or links missing href
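The attribute checks for images and links could be sketched as follows. `AttributeAuditor` and `audit_attributes` are hypothetical names. Only the presence of `alt` is tested, on the assumption that an empty `alt=""` is acceptable for decorative images.

```python
from html.parser import HTMLParser


class AttributeAuditor(HTMLParser):
    """Records warnings for <img> and <a> tags missing key attributes."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        line = self.getpos()[0]
        if tag == "img":
            if not attrs.get("src"):
                self.warnings.append(f"line {line}: <img> missing src")
            if "alt" not in attrs:
                self.warnings.append(f"line {line}: <img> missing alt")
        elif tag == "a" and not attrs.get("href"):
            self.warnings.append(f"line {line}: <a> missing href")


def audit_attributes(html: str) -> list[str]:
    auditor = AttributeAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.warnings
```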
Check 10: Table Tags (if present)
Requirement: Proper table structure
Checks:
- <table>, <tr>, <td>, <th> tags properly nested
- All rows have consistent column counts
- Table headers and body properly structured
Error if: Malformed table structure
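The column-count rule can be illustrated with a small parser that sums cell widths per row. This is a sketch for a single, non-nested table: `RowWidthCollector` and `rows_have_consistent_columns` are hypothetical names, `colspan` is honoured, and `rowspan` is ignored for simplicity.

```python
from html.parser import HTMLParser


class RowWidthCollector(HTMLParser):
    """Counts the effective number of columns in each <tr> (illustrative)."""

    def __init__(self):
        super().__init__()
        self.row_widths = []  # one entry per <tr>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row_widths.append(0)
        elif tag in ("td", "th") and self.row_widths:
            colspan = dict(attrs).get("colspan") or "1"
            # Fall back to a width of 1 when colspan is not a plain number.
            self.row_widths[-1] += int(colspan) if colspan.isdigit() else 1


def rows_have_consistent_columns(table_html: str) -> bool:
    collector = RowWidthCollector()
    collector.feed(table_html)
    collector.close()
    return len(set(collector.row_widths)) <= 1
```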
Validation Report Format
Output: 06_validation_structure.json
{
  "page": 16,
  "book_page": 17,
  "chapter": 2,
  "validation_type": "structure",
  "validation_timestamp": "2025-11-08T14:34:00Z",
  "overall_status": "PASS",
  "error_count": 0,
  "warning_count": 1,
  "checks_performed": [
    {
      "check_name": "DOCTYPE Declaration",
      "status": "PASS",
      "details": "Valid HTML5 DOCTYPE found"
    },
    {
      "check_name": "HTML Tags",
      "status": "PASS",
      "details": "Proper <html> opening and closing tags"
    },
    {
      "check_name": "Head Section",
      "status": "PASS",
      "details": "All required meta tags and title present"
    },
    {
      "check_name": "Body Section",
      "status": "PASS",
      "details": "Body and content structure valid"
    },
    {
      "check_name": "Tag Closure",
      "status": "PASS",
      "details": "All tags properly matched and closed"
    },
    {
      "check_name": "Heading Hierarchy",
      "status": "WARNING",
      "details": "4 headings found; first heading is h2, not h1"
    },
    {
      "check_name": "Content Structure",
      "status": "PASS",
      "details": "Main content area contains 245 words across 3 paragraphs"
    },
    {
      "check_name": "List Integrity",
      "status": "PASS",
      "details": "1 list with 3 items, all properly formed"
    },
    {
      "check_name": "Image Tags",
      "status": "PASS",
      "details": "No images on this page"
    },
    {
      "check_name": "Table Tags",
      "status": "PASS",
      "details": "No tables on this page"
    }
  ],
  "errors": [],
  "warnings": [
    {
      "check": "Heading Hierarchy",
      "message": "First heading is h2, typically should be h1 for page opening",
      "severity": "LOW"
    }
  ],
  "summary": {
    "total_checks": 10,
    "passed": 9,
    "failed": 0,
    "warnings": 1,
    "html_valid": true,
    "tags_matched": true,
    "content_substantial": true
  }
}
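A report with this shape could be assembled from individual check results roughly as follows. This is a sketch: `build_report` and `save_report` are hypothetical helpers, the "FAIL" status string is an assumption (the example only shows passing output), and several fields from the example (`book_page`, the boolean summary flags) are omitted for brevity.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_report(page, chapter, checks, errors, warnings):
    """Assemble the report dict; `checks` holds check_name/status/details dicts."""
    passed = sum(1 for check in checks if check["status"] == "PASS")
    failed = sum(1 for check in checks if check["status"] == "FAIL")
    return {
        "page": page,
        "chapter": chapter,
        "validation_type": "structure",
        "validation_timestamp": datetime.now(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        ),
        "overall_status": "FAIL" if errors else "PASS",
        "error_count": len(errors),
        "warning_count": len(warnings),
        "checks_performed": checks,
        "errors": errors,
        "warnings": warnings,
        "summary": {
            "total_checks": len(checks),
            "passed": passed,
            "failed": failed,
            "warnings": len(warnings),
        },
    }


def save_report(report: dict, output_dir: str) -> Path:
    """Write the report as 06_validation_structure.json inside output_dir."""
    target = Path(output_dir) / "06_validation_structure.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return target
```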
Validation Rules
PASS Criteria
- DOCTYPE present and valid
- All required tags (html, head, body, main, div.page-container) present
- All tags properly closed and matched
- Title tag with content
- CSS stylesheet link present
- Content structure valid
- No structural errors
FAIL Criteria (BLOCKS PIPELINE)
- Missing DOCTYPE
- Missing required tags
- Unmatched or improperly nested tags
- Missing title or CSS link
- Empty content
- Malformed lists or tables
WARNING (Logged but doesn't block)
- Missing viewport meta tag
- First heading is not h1
- Large heading jumps (h1 → h4)
- Missing alt text on images
- Missing href on links
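Combined with the `strict_mode` input parameter, these rules reduce to a small exit-code decision. The sketch below assumes a hypothetical function name, `exit_code`. The behaviour follows the text: errors always block, and warnings block only in strict mode.

```python
def exit_code(error_count: int, warning_count: int, strict_mode: bool = False) -> int:
    """Return 0 to continue the pipeline, 1 to stop it."""
    if error_count > 0:
        return 1
    if strict_mode and warning_count > 0:
        return 1
    return 0
```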
Implementation: Using Python Script
This validation is performed by the existing validate_html.py tool, run in structure validation mode:
cd Calypso/tools
# Validate single page HTML
python3 validate_html.py \
../output/chapter_02/page_artifacts/page_16/04_page_16.html \
--output-json ../output/chapter_02/page_artifacts/page_16/06_validation_structure.json \
--strict-structure
# Exit code:
# 0 = VALID (continue to next skill)
# 1 = INVALID (STOP pipeline)
Hook Integration
When validation FAILS:
# Trigger hook: .claude/hooks/validate-structure.sh
# Receives:
# - Page number
# - HTML file path
# - Validation report path
# - Error details
# Hook behavior:
# - Log failure with details
# - Save error report
# - Notify user
# - STOP pipeline (no further processing)
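From Python, the hook could be invoked with `subprocess`. The argument order below is purely an assumption, since the text lists what the hook receives but not how. `build_hook_command` and `trigger_hook` are hypothetical helpers.

```python
import subprocess

HOOK_PATH = ".claude/hooks/validate-structure.sh"


def build_hook_command(page_number, html_path, report_path, error_summary):
    """Build the argv list for the hook (argument order is an assumption)."""
    return ["bash", HOOK_PATH, str(page_number), html_path, report_path, error_summary]


def trigger_hook(page_number, html_path, report_path, error_summary) -> int:
    """Run the hook and return its exit status without raising on failure."""
    command = build_hook_command(page_number, html_path, report_path, error_summary)
    return subprocess.run(command, check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when paths or error text contain spaces.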
Error Recovery
If validation fails:
- User reviews validation report
- User identifies issue in AI-generated HTML
- Options:
- Fix HTML manually and re-validate
- Re-run AI generation with improved prompt
- Review source extraction data for errors
- Proceed with caution (expert override)
Quality Metrics
Validation provides metrics:
- Percentage of checks passing
- Error severity levels
- Content size (word count, element count)
- Structure complexity
These metrics feed into final quality reports.
Success Criteria
✓ Validation completes successfully
✓ All structural checks pass (0 errors)
✓ Validation report saved in JSON format
✓ Exit code 0 returned (or 1 if invalid)
✓ Clear error messages if validation fails
Next Steps After PASS
If validation passes:
- All pages of chapter processed through this gate
- Skill 4 (consolidate pages) merges individual page HTMLs
- Quality Gate 2 (semantic validate) checks semantic structure
- Continue through validation pipeline
Next Steps After FAIL
If validation fails:
- PIPELINE STOPS
- Hook validate-structure.sh triggered
- User receives error report with details
- User must fix issues and retry
Design Notes
- This is the first deterministic quality gate
- Uses proven validate_html.py tool
- Catches structural issues before semantic analysis
- Provides clear, actionable error messages
- Essential for ensuring pipeline reliability
Testing
To test structure validation:
# Test with known-good HTML
python3 validate_html.py ../output/chapter_01/chapter_01.html
# Should show: ✓ VALID
# Test with invalid HTML (if needed)
python3 validate_html.py broken_html.html
# Should show: ✗ INVALID with specific errors