Agent skill
web-search
Install this agent skill to your Project
npx add-skill https://github.com/treasure-data/td-skills/tree/main/studio-skills/web-search
SKILL.md
Web Search
Use web_search for research and page extraction. Powered by OpenAI web search (Bing-backed): search + extraction + summarization in one call.
Key Insight
The query is a prompt to an LLM with web access. Write instructions, not bare keywords.
✗ "Snowflake pricing"
✓ "List all pricing tiers from https://snowflake.com/pricing in a table with features"
Parameters
- search_context_size: low (quick facts)
- search_context_size: medium (general research; default)
- search_context_size: high (URL extraction, deep analysis)
Use high whenever a specific URL is in the query.
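The rule of thumb above can be sketched as a small helper. This is an illustrative sketch only: `pick_context_size` and its `quick_fact` flag are hypothetical names, not part of the skill or of the web_search tool.

```python
import re

# Hypothetical helper (not part of the skill): encodes the guidance above.
# Matches any http:// or https:// URL inside the query text.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def pick_context_size(query: str, quick_fact: bool = False) -> str:
    """Return a search_context_size value for the given query."""
    # A specific URL in the query means page extraction, so use "high".
    if URL_PATTERN.search(query):
        return "high"
    # The caller flags simple fact lookups; those can use "low".
    if quick_fact:
        return "low"
    # Everything else falls back to the documented default.
    return "medium"
```

For example, `pick_context_size("List all pricing tiers from https://snowflake.com/pricing in a table with features")` selects `high` because the query names a URL.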
Query Patterns
URL extraction — read a specific page and structure the output:
"Extract all items from https://example.com/page and list each with details in a table"
Structured research — request comparison tables, bullet summaries:
"Compare {A} vs {B} vs {C} covering pricing, features, and target audience"
Search operators — "exact phrase", site:domain.com, AND/OR, -exclude:
site:reddit.com "nextjs" vs "remix" experience
"{company}" AND ("funding" OR "revenue") 2025 2026
Translation — read non-English pages:
"Translate and summarize: https://example.co.jp/news/"
API endpoints — read JSON responses:
"Describe the JSON structure of https://api.github.com/repos/{owner}/{repo}"
Strengths
- Bypasses bot protection (Reddit, G2, Gartner) via Bing's index
- Extracts structured data (pricing, jobs, reviews, changelogs)
- Reads and translates any language
- Supports search operators (site:, AND/OR, "quoted")
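The operators listed above can be combined programmatically. Below is a minimal sketch of a query builder; `build_operator_query` and its parameter names are hypothetical and only assemble a string in the style of the Query Patterns examples.

```python
from typing import Optional, Sequence


def build_operator_query(
    phrases: Sequence[str],
    site: Optional[str] = None,
    any_of: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> str:
    """Assemble a search-operator query string (hypothetical helper)."""
    parts = []
    if site:
        # Restrict results to one domain.
        parts.append(f"site:{site}")
    # Each required phrase is wrapped in quotes for exact matching.
    parts.extend(f'"{p}"' for p in phrases)
    if any_of:
        group = "(" + " OR ".join(f'"{t}"' for t in any_of) + ")"
        # Join the OR group to the required phrases with AND, as in the examples.
        parts.append(f"AND {group}" if phrases else group)
    # Excluded terms get a leading minus sign.
    parts.extend(f"-{t}" for t in exclude)
    return " ".join(parts)
```

Calling `build_operator_query(["Acme"], any_of=["funding", "revenue"])` yields `"Acme" AND ("funding" OR "revenue")`, where "Acme" is a placeholder company name.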
Limitations
- No verbatim full-text: summarizes copyrighted content instead of reproducing it
- 24-48h recency lag: not suitable for real-time monitoring
- No PDF internals: filetype:pdf is unreliable
- No auth pages: cannot access login-required content
- Paywalls: reads public portions only
Workarounds for Limitations
When web_search falls short, suggest these alternatives to the user:
- Real-time monitoring → RSS feeds (many sites expose /feed or /rss)
- Raw HTML / interactive pages → Playwright (headless browser; click, scroll, screenshot)
- Full-text extraction → Playwright to fetch raw HTML, then parse locally
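As an illustration of the RSS workaround, here is a minimal sketch that parses an RSS 2.0 document with Python's standard library. The sample feed is made up for the example, and `parse_rss_items` is a hypothetical helper; fetching the feed from a real /feed or /rss URL is left out so the sketch stays self-contained.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; in practice this text would come from a site's feed URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def parse_rss_items(xml_text: str) -> list:
    """Return a list of {"title", "link"} dicts, one per <item> in the feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                # findtext returns the default when the child element is missing.
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            }
        )
    return items


items = parse_rss_items(SAMPLE_FEED)
```

Polling a feed like this on a schedule sidesteps the recency lag, since the data comes straight from the site rather than from a search index.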
Examples
Example: Extract page content
Input: "What jobs are open at Company X?"
Action: "List all job openings from https://company-x.com/careers/ with title and location" (high)
Example: Research with multiple angles
Input: "Research Company X's market position"
Action: Run in parallel:
- "Company X market position analyst reports 2026" (medium)
- "Extract key metrics from https://company-x.com/about" (high)
- site:reddit.com "Company X" opinions experience (medium)
Example: Translate foreign page
Input: "What does this Japanese press release say?"
Action: "Translate to English and summarize: https://example.co.jp/press/" (high)
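The "run in parallel" step from the multi-angle research example could be orchestrated roughly as follows. This is a sketch under assumptions: the `web_search` coroutine here is a stand-in stub, not the real tool, and how the tool is actually invoked depends on your agent runtime.

```python
import asyncio


async def web_search(query: str, search_context_size: str = "medium") -> str:
    """Stand-in stub for the real tool call (assumption, not the actual API)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{search_context_size}] {query}"


async def research_company(name: str, about_url: str) -> list:
    """Issue the three research queries concurrently and collect the results."""
    queries = [
        (f"{name} market position analyst reports 2026", "medium"),
        (f"Extract key metrics from {about_url}", "high"),
        (f'site:reddit.com "{name}" opinions experience', "medium"),
    ]
    # gather runs the calls concurrently and returns results in input order.
    return await asyncio.gather(*(web_search(q, size) for q, size in queries))


results = asyncio.run(research_company("Company X", "https://company-x.com/about"))
```

Because `asyncio.gather` preserves input order, each result can be matched back to its query by position.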