Agent skill
legislative-flattener
Converts hierarchical legislative text from Word documents into a flat list of requirements. Use when processing regulatory documents, compliance frameworks, or legal text that needs to be extracted into individual, numbered requirements for analysis or mapping.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/legislative-flattener
SKILL.md
Legislative Text Flattener
This skill processes Word documents (DOCX) containing legislative or regulatory text and converts hierarchical structures into a flat, numbered list of discrete requirements or provisions.
When to Use This Skill
Activate this skill when the user wants to:
- Flatten hierarchical legislative or regulatory text
- Extract requirements from compliance frameworks
- Convert nested sections/subsections into a linear list
- Prepare legislative text for requirement mapping
- Process Word documents containing legal or regulatory content
Quick Start
Install required dependencies:
pip install python-docx openpyxl
Basic usage (outputs formatted XLSX by default):
python flattener_utility.py input.docx output.xlsx
Or specify format explicitly:
python flattener_utility.py input.docx output.xlsx --format xlsx
python flattener_utility.py input.docx output.csv --format csv
python flattener_utility.py input.docx output.json --format json
Processing Instructions
1. Input Analysis
First, understand the input document structure:
- Read the DOCX file (use python-docx library or convert to text)
- Identify the hierarchical structure (sections, subsections, paragraphs, sub-paragraphs)
- Detect numbering schemes (1.2.3, (a)(b)(c), Article-Section-Paragraph, etc.)
- Note any special formatting or emphasis (bold, italics, SHALL/MUST keywords)
2. Extraction Process
Extract content while preserving context:
- Each discrete requirement or provision becomes a separate item
- Maintain parent context for nested items
- Preserve the original numbering/reference system
- Extract complete sentences or logical requirement units
- Identify normative language (shall, must, should, may)
3. Flattening Strategy
Convert hierarchy to flat structure:
- Assign sequential flat numbering (1, 2, 3...)
- Include original reference in metadata (e.g., "Section 500.02(b)(3)")
- Preserve full context path for each item
- Handle multi-level lists and nested requirements
- Maintain logical groupings where appropriate
4. Output Format
Generate a structured flat list with these fields for each requirement:
Flat ID: [Sequential number]
Original Reference: [Original section/subsection identifier]
Context Path: [Full hierarchical path, e.g., "Article 5 > Section 2 > Paragraph b"]
Requirement Type: [Mandate/Prohibition/Permission/Definition]
Normative Level: [SHALL/MUST/SHOULD/MAY/INFORMATIVE]
Text: [Full requirement text]
Keywords: [Extracted key concepts or compliance areas]
---
5. Implementation Approach
Use this workflow:
# Install required library if needed
# pip install python-docx
from docx import Document
import re
def flatten_legislative_text(docx_path):
"""
Flattens hierarchical legislative text from DOCX.
Returns a list of flattened requirements.
"""
doc = Document(docx_path)
flattened = []
flat_id = 1
context_stack = []
for paragraph in doc.paragraphs:
# Skip empty paragraphs
if not paragraph.text.strip():
continue
# Detect hierarchy level (by style or numbering)
level = detect_level(paragraph)
original_ref = extract_reference(paragraph)
# Update context stack
update_context(context_stack, level, paragraph.text)
# Extract if it's a requirement (not just a heading)
if is_requirement(paragraph):
item = {
'flat_id': flat_id,
'original_ref': original_ref,
'context_path': ' > '.join(context_stack),
'requirement_type': classify_requirement(paragraph.text),
'normative_level': extract_normative_level(paragraph.text),
'text': clean_text(paragraph.text),
'keywords': extract_keywords(paragraph.text)
}
flattened.append(item)
flat_id += 1
return flattened
6. Output Options
Provide results in user's preferred format:
- XLSX (Excel): Formatted Excel workbook with styled headers, auto-filter, frozen panes, and optimized column widths (default and recommended)
- CSV: Simple tabular format with all fields as columns
- JSON: Structured data for programmatic use
- Markdown: Human-readable with clear section breaks
- Database insert: SQL statements or database-ready format
7. Quality Checks
Validate the flattening:
- Ensure no requirements are lost or duplicated
- Verify context paths are complete and accurate
- Check that normative language is correctly identified
- Confirm original references are preserved
- Review that similar requirements are consistently formatted
Example Output (Markdown Format)
## Flattened Requirements
### Requirement 1
- **Flat ID**: 1
- **Original Reference**: Section 500.02(a)
- **Context Path**: Part 500 > Section 500.02 > Paragraph (a)
- **Requirement Type**: Mandate
- **Normative Level**: SHALL
- **Text**: Each Covered Entity shall maintain a cybersecurity program designed to protect the confidentiality, integrity and availability of the Covered Entity's Information Systems.
- **Keywords**: cybersecurity program, confidentiality, integrity, availability, Information Systems
---
### Requirement 2
- **Flat ID**: 2
- **Original Reference**: Section 500.02(b)
- **Context Path**: Part 500 > Section 500.02 > Paragraph (b)
- **Requirement Type**: Mandate
- **Normative Level**: SHALL
- **Text**: The cybersecurity program shall be based on the Covered Entity's Risk Assessment and designed to perform the following core cybersecurity functions...
- **Keywords**: Risk Assessment, core cybersecurity functions
---
Notes
- XLSX output is recommended for most use cases as it provides:
- Professional formatting with styled headers
- Auto-filter for easy data exploration
- Frozen header row for scrolling large datasets
- Optimized column widths for readability
- Text wrapping for long requirement text
- Handle tables within documents by processing each cell as potential requirement text
- Preserve cross-references between sections where they exist
- Flag ambiguous or incomplete requirements for manual review
- Support batch processing of multiple documents
- Maintain a processing log of any items that couldn't be automatically classified
XLSX Features
The XLSX export includes:
- Header styling: Bold white text on blue background
- Auto-filter: Filter requirements by any column
- Frozen panes: Header row stays visible when scrolling
- Column widths: Optimized for each field type (10-60 characters)
- Text wrapping: Long text automatically wraps for readability
- Borders: Clean grid lines for professional appearance
Supporting Files
See examples.md for sample input/output pairs and template.md for customizable output templates.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
agent-ops-spec
Manage specification documents in .agent/specs/. Use when user provides requirements, acceptance criteria, or feature descriptions that need to be tracked and validated against implementation.
agent-ops-state
Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.
agent-ops-spec
Manage specification documents in .agent/specs/. Use when user provides requirements, acceptance criteria, or feature descriptions that need to be tracked and validated against implementation.
agent-ops-testing
Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
agent-ops-testing
Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
agent-ops-state
Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.
Didn't find tool you were looking for?