Agent skill

web-search

Install this agent skill to your Project

npx add-skill https://github.com/treasure-data/td-skills/tree/main/studio-skills/web-search

SKILL.md

Web Search

Use web_search for research and page extraction. Powered by OpenAI web search (Bing-backed): search + extraction + summarization in one call.

Key Insight

The query is a prompt to an LLM with web access. Write instructions, not bare keywords.

✗ "Snowflake pricing"
✓ "List all pricing tiers from https://snowflake.com/pricing in a table with features"

Parameters

search_context_size: low — quick facts
search_context_size: medium — general research (default)
search_context_size: high — URL extraction, deep analysis

Use high whenever a specific URL is in the query.
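The rule above can be sketched as a small helper. This is only an illustration: `pick_context_size` and `build_request` are hypothetical names, and the request shape (a dict with `query` and `search_context_size`) is an assumption about how the tool is called, not something the skill defines.

```python
import re

# Hypothetical helper, not part of the skill: picks a search_context_size
# value using the rule stated above (a URL in the query means "high").
URL_PATTERN = re.compile(r"https?://\S+")


def pick_context_size(query: str, quick_fact: bool = False) -> str:
    """Return "high", "low", or "medium" for the given query."""
    if URL_PATTERN.search(query):
        return "high"    # a specific page must be read: URL extraction
    if quick_fact:
        return "low"     # short factual lookup
    return "medium"      # general research (the documented default)


def build_request(query: str, quick_fact: bool = False) -> dict:
    # The key names mirror the parameter names used in this document;
    # the exact call shape of web_search is an assumption.
    return {
        "query": query,
        "search_context_size": pick_context_size(query, quick_fact),
    }
```

The URL check runs first, so a query that names a page gets high even when it is also flagged as a quick fact.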

Query Patterns

URL extraction — read a specific page and structure the output:

"Extract all items from https://example.com/page and list each with details in a table"

Structured research — request comparison tables, bullet summaries:

"Compare {A} vs {B} vs {C} covering pricing, features, and target audience"

Search operators — "exact phrase", site:domain.com, AND/OR, -exclude:

site:reddit.com "nextjs" vs "remix" experience
"{company}" AND ("funding" OR "revenue") 2025 2026

Translation — read non-English pages:

"Translate and summarize: https://example.co.jp/news/"

API endpoints — read JSON responses:

"Describe the JSON structure of https://api.github.com/repos/{owner}/{repo}"

Strengths

Bypasses bot protection (Reddit, G2, Gartner) via Bing's index
Extracts structured data (pricing, jobs, reviews, changelogs)
Reads non-English pages and translates them
Supports search operators (site:, AND/OR, "quoted")
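When operator queries are assembled programmatically, a small builder keeps the quoting consistent. The sketch below is a minimal illustration; `build_operator_query` and its argument names are hypothetical and not part of the skill.

```python
def build_operator_query(terms=(), phrases=(), site=None, any_of=(), exclude=()):
    """Compose a query string from the operators listed in this document.

    Hypothetical helper for illustration only.
    """
    parts = []
    if site:
        parts.append(f"site:{site}")                 # restrict to one domain
    parts.extend(f'"{p}"' for p in phrases)          # exact phrases
    if any_of:
        # Alternatives are grouped in parentheses and joined with OR.
        parts.append("(" + " OR ".join(f'"{a}"' for a in any_of) + ")")
    parts.extend(terms)                              # plain keywords
    parts.extend(f"-{e}" for e in exclude)           # excluded words
    return " ".join(parts)


reddit_query = build_operator_query(
    site="reddit.com", phrases=["nextjs"], terms=["experience"]
)
funding_query = build_operator_query(
    phrases=["Acme"], any_of=["funding", "revenue"], terms=["2025"], exclude=["jobs"]
)
```

Here `"Acme"` is a placeholder company name. The builder leaves AND implicit between parts; add it explicitly if the query should match the style shown under Query Patterns.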

Limitations

No verbatim full-text — summarizes copyrighted content instead of reproducing it
24-48h recency lag — not suitable for real-time monitoring
No PDF internals — filetype:pdf is unreliable
No auth pages — cannot access login-required content
Paywalls — reads public portions only

Workarounds for Limitations

When web_search falls short, suggest these alternatives to the user:

Real-time monitoring → RSS feeds (many sites expose /feed or /rss)
Raw HTML / interactive pages → Playwright (headless browser; click, scroll, screenshot)
Full-text extraction → Playwright to fetch raw HTML, then parse locally
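For the RSS workaround, a feed can be parsed with the Python standard library alone. The sketch below assumes a plain RSS 2.0 feed and uses an inline sample instead of a network call; `parse_rss_items` and `SAMPLE_FEED` are hypothetical names, and real feeds (Atom, namespaced extensions) may need more handling.

```python
import xml.etree.ElementTree as ET

# Inline sample so the sketch runs offline; a real feed would be fetched
# from a site's /feed or /rss path.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example</title>
  <item>
    <title>First post</title>
    <link>https://example.com/1</link>
    <pubDate>Mon, 05 Jan 2026 09:00:00 GMT</pubDate>
  </item>
  <item>
    <title>Second post</title>
    <link>https://example.com/2</link>
    <pubDate>Tue, 06 Jan 2026 09:00:00 GMT</pubDate>
  </item>
</channel></rss>"""


def parse_rss_items(xml_text: str) -> list:
    """Return title, link, and publication date for each <item> in the feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        }
        for item in root.iter("item")
    ]


items = parse_rss_items(SAMPLE_FEED)
```

Polling a feed this way avoids the recency lag because the data comes straight from the publisher, not from a search index.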

Examples

Example: Extract page content

Input: "What jobs are open at Company X?"
Action: "List all job openings from https://company-x.com/careers/ with title and location" (high)

Example: Research with multiple angles

Input: "Research Company X's market position"
Action: Run in parallel:

"Company X market position analyst reports 2026" (medium)
"Extract key metrics from https://company-x.com/about" (high)
site:reddit.com "Company X" opinions experience (medium)
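Running the three queries above in parallel might look like the sketch below. The `web_search` function here is a stand-in stub so the example runs on its own; how an agent actually invokes the tool is an assumption, as is the `run_parallel` helper.

```python
from concurrent.futures import ThreadPoolExecutor


def web_search(query: str, search_context_size: str = "medium") -> str:
    # Stand-in stub (assumption): a real agent would call the web_search tool here.
    return f"[{search_context_size}] {query}"


def run_parallel(requests):
    """Submit every (query, size) pair at once and return results in input order."""
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        futures = [pool.submit(web_search, query, size) for query, size in requests]
        return [future.result() for future in futures]


requests = [
    ("Company X market position analyst reports 2026", "medium"),
    ("Extract key metrics from https://company-x.com/about", "high"),
    ('site:reddit.com "Company X" opinions experience', "medium"),
]
results = run_parallel(requests)
```

Threads suit this case because each call spends its time waiting on the network. Collecting results from the futures list keeps the output aligned with the order of `requests`.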

Example: Translate foreign page

Input: "What does this Japanese press release say?"
Action: "Translate to English and summarize: https://example.co.jp/press/" (high)

Maintainer

treasure-data Core maintainer

Source details

Full Name: treasure-data/td-skills
Branch: main
Path in repo: studio-skills/web-search


