Agent skill
web-fetch
Fetches web content with intelligent content extraction, converting HTML to clean markdown. Use for documentation, articles, and reference pages at http/https URLs.
Install this agent skill in your project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/0xbigboss/web-fetch
SKILL.md
Web Content Fetching
Fetch web content using curl | html2markdown with CSS selectors for clean, complete markdown output.
Quick Usage (Known Sites)
Use site-specific selectors for best results:
# Anthropic docs
curl -s "<url>" | html2markdown --include-selector "#content-container"
# MDN Web Docs
curl -s "<url>" | html2markdown --include-selector "article"
# GitHub docs
curl -s "<url>" | html2markdown --include-selector "article" --exclude-selector "nav,.sidebar"
# Generic article pages
curl -s "<url>" | html2markdown --include-selector "article,main,[role=main]" --exclude-selector "nav,header,footer"
Site Patterns
| Site | Include Selector | Exclude Selector |
|---|---|---|
| platform.claude.com | #content-container | - |
| docs.anthropic.com | #content-container | - |
| developer.mozilla.org | article | - |
| github.com (docs) | article | nav,.sidebar |
| Generic | article,main | nav,header,footer,script,style |
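The table can also be encoded as a lookup so the right selectors are chosen from the URL's host. The following is a minimal sketch, not part of the skill itself: `selectors_for` and `fetch_known` are hypothetical helper names, and treating "github.com (docs)" as covering `docs.github.com` is an assumption.

```shell
# Hypothetical helper: print "<include>|<exclude>" for a URL, following the
# Site Patterns table. An empty exclude part means the table lists none ("-").
selectors_for() (
  host=${1#*://}      # drop the scheme
  host=${host%%/*}    # drop the path
  case "$host" in
    platform.claude.com|docs.anthropic.com)
      printf '%s|%s\n' '#content-container' '' ;;
    developer.mozilla.org)
      printf '%s|%s\n' 'article' '' ;;
    github.com|docs.github.com)   # assumption: "github.com (docs)" includes docs.github.com
      printf '%s|%s\n' 'article' 'nav,.sidebar' ;;
    *)                            # the "Generic" row
      printf '%s|%s\n' 'article,main' 'nav,header,footer,script,style' ;;
  esac
)

# Hypothetical wrapper: fetch a URL with the selectors chosen above.
fetch_known() (
  sel=$(selectors_for "$1")
  include=${sel%%|*}
  exclude=${sel#*|}
  if [ -n "$exclude" ]; then
    curl -s "$1" | html2markdown --include-selector "$include" --exclude-selector "$exclude"
  else
    curl -s "$1" | html2markdown --include-selector "$include"
  fi
)
```

With these defined, `fetch_known "<url>"` runs the same pipeline as the Quick Usage commands without looking up the row by hand.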
Universal Fallback (Unknown Sites)
For sites without known patterns, use the Bun script, which auto-detects content:
bun ~/.claude/skills/web-fetch/fetch.ts "<url>"
Setup (one-time)
cd ~/.claude/skills/web-fetch && bun install
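One way to combine the two approaches is to try the generic selectors first and call the Bun script only when they produce nothing. This is a rough sketch under that assumption; `fetch_or_fallback` is a hypothetical name, and the skill does not prescribe this ordering.

```shell
# Hypothetical helper: try the generic selectors, then fall back to the Bun script.
fetch_or_fallback() (
  url=$1
  out=$(curl -s "$url" | html2markdown \
    --include-selector "article,main" \
    --exclude-selector "nav,header,footer,script,style")
  # Treat whitespace-only output as "the selector did not match".
  if [ -n "$(printf '%s' "$out" | tr -d '[:space:]')" ]; then
    printf '%s\n' "$out"
  else
    bun "$HOME/.claude/skills/web-fetch/fetch.ts" "$url"
  fi
)
```

Usage would be `fetch_or_fallback "<url>"`. The fallback path assumes the one-time `bun install` above has already been run.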
Finding the Right Selector
When a site isn't in the patterns list:
# Check what content containers exist
curl -s "<url>" | grep -oE '<article[^>]*>|<main[^>]*>|id="[^"]*content[^"]*"' | head -10
# Test a selector
curl -s "<url>" | html2markdown --include-selector "<selector>" | head -30
# Check line count
curl -s "<url>" | html2markdown --include-selector "<selector>" | wc -l
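When several selectors look plausible, the line-count check can be run over all of them in one pass. The sketch below assumes nothing beyond the commands already shown; `compare_selectors` is a hypothetical helper name.

```shell
# Hypothetical helper: print the markdown line count for each candidate selector.
# Usage: compare_selectors "<url>" "article" "main" "#content"
compare_selectors() (
  url=$1
  shift
  # Fetch the page once and reuse the HTML for every selector.
  html=$(curl -s "$url")
  for sel in "$@"; do
    count=$(printf '%s\n' "$html" | html2markdown --include-selector "$sel" | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$count" "$sel"
  done
)
```

A count of 0 suggests the selector matched nothing, while a count close to the full-page output suggests it is too broad to remove navigation noise.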
Options Reference
--include-selector "CSS" # Only include matching elements
--exclude-selector "CSS" # Remove matching elements
--domain "https://..." # Convert relative links to absolute
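As an illustration of `--domain`, the value can be derived from the URL being fetched. This sketch assumes the flag accepts the page's scheme and host and that the URL contains a path separator or nothing after the host; `origin_of` and `fetch_absolute` are hypothetical helper names.

```shell
# Hypothetical helper: print the scheme and host of a URL.
origin_of() (
  url=$1
  scheme=${url%%://*}   # text before the first "://"
  rest=${url#*://}      # text after the first "://"
  printf '%s://%s\n' "$scheme" "${rest%%/*}"
)

# Hypothetical wrapper: fetch with relative links rewritten against the page's origin.
# Usage: fetch_absolute "<url>" "<selector>"
fetch_absolute() (
  curl -s "$1" | html2markdown --domain "$(origin_of "$1")" --include-selector "$2"
)
```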
Comparison
| Method | Anthropic Docs | Code Blocks | Complexity |
|---|---|---|---|
| Full page | 602 lines | Yes | Noisy |
| --include-selector "#content-container" | 385 lines | Yes | Clean |
| Bun script (universal) | 383 lines | Yes | Clean |
Troubleshooting
Wrong content selected: The site may have multiple articles. Inspect the HTML:
curl -s "<url>" | grep -o '<article[^>]*>'
Empty output: The selector doesn't match. Try broader selectors like main or body.
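Missing code blocks: Check whether the site marks up code with something other than standard pre/code elements (for example, styled div or span elements), which may not convert to code blocks.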
Client-rendered content: If HTML only has "Loading..." placeholders, the content is JS-rendered. Neither curl nor the Bun script can extract it; use browser-based tools.
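The empty-output and client-rendered cases can be told apart roughly by inspecting the converted markdown. The following is a heuristic sketch only: `classify_output` is a hypothetical name, and both the 5-line threshold and the "loading" keyword match are arbitrary assumptions.

```shell
# Hypothetical check: read converted markdown on stdin and print
# "empty", "placeholder", or "ok".
# Usage: curl -s "<url>" | html2markdown --include-selector "main" | classify_output
classify_output() (
  md=$(cat)
  # Count lines that contain at least one non-blank field.
  lines=$(printf '%s\n' "$md" | awk 'NF { n++ } END { print n + 0 }')
  if [ "$lines" -eq 0 ]; then
    echo "empty"          # selector likely did not match; try a broader one
  elif [ "$lines" -lt 5 ] && printf '%s\n' "$md" | grep -qi 'loading'; then
    echo "placeholder"    # likely JS-rendered; use browser-based tools
  else
    echo "ok"
  fi
)
```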