Agent skill

methodology

Analyze captured HTTP traffic, design CLI architecture, and implement the Python CLI package. Covers Phase 2 of the pipeline: parse raw-traffic.json, identify protocol type, map endpoints, design Click command groups, implement with parallel subagents. TRIGGER when: "analyze traffic", "design CLI", "implement CLI", "build CLI from network traffic", "generate API wrapper", "reverse engineer web API", "start Phase 2", raw-traffic.json exists and capture is complete, or after the capture skill finishes. DO NOT trigger for: traffic recording (use capture), test writing (use testing), or quality checks (use standards).

Install this agent skill to your Project

npx add-skill https://github.com/ItamarZand88/CLI-Anything-WEB/tree/main/cli-anything-web-plugin/skills/methodology

SKILL.md

CLI-Anything-Web Methodology (Phase 2)

Analyze captured traffic, design the CLI command structure, and implement the complete Python CLI package. This skill owns the core transformation from raw HTTP traffic to a production-ready CLI.


Prerequisites (Hard Gate)

Do NOT start unless:

  • raw-traffic.json exists (with WRITE operations, or read-only GET-only traffic)
  • Auth state was captured during Phase 1 (if the site requires auth)

If raw-traffic.json is missing, or it has no WRITE operations and the site is not genuinely read-only (see the exception below), invoke the capture skill first.

Exception for read-only sites: If the site is genuinely read-only (search engine, dashboard, analytics viewer with no create/update/delete), the trace may contain only GET requests. In this case, note "read-only site — no write operations" in <APP>.md and proceed. The generated CLI will have read-only commands (list, get, search) but no create/update/delete commands. This is valid.

No-auth sites: If the target site requires no authentication (public API, no login needed), the "Auth state captured" prerequisite does not apply. Note "no-auth site" in <APP>.md and proceed.
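
The gate and its two exceptions can be expressed as a small check. This is a minimal sketch, not part of the plugin: it assumes raw-traffic.json has already been loaded into a flat list of entries that each carry a `method` key (the real capture format may differ), and `check_prerequisites` and its keyword arguments are hypothetical names.

```python
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def check_prerequisites(entries, *, read_only_site=False,
                        requires_auth=True, auth_captured=False):
    """Return (ok, notes); notes are reminders to record in <APP>.md.

    `entries` is an assumed shape: a list of dicts with a "method" key.
    """
    if not entries:
        return False, ["raw-traffic.json missing or empty: run the capture skill"]

    notes = []
    has_write = any(e.get("method", "").upper() in WRITE_METHODS for e in entries)
    if not has_write:
        if not read_only_site:
            return False, ["no WRITE operations captured: run the capture skill"]
        notes.append("read-only site")  # exception for read-only sites

    if requires_auth:
        if not auth_captured:
            return False, ["auth state was not captured in Phase 1"]
    else:
        notes.append("no-auth site")  # exception for public sites

    return True, notes
```

Whether a site is genuinely read-only or public is still a human judgment; the function only makes the decision table explicit.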


Step A: Analyze (API Discovery)

Goal: Map raw traffic to a structured API model.

Process:

  1. Read traffic-analysis.json first (if it exists alongside raw-traffic.json). This file is auto-generated by parse-trace.py, mitmproxy-capture.py, or analyze-traffic.py and contains the pre-detected protocol type, auth pattern, endpoint grouping, GraphQL operations, batchexecute RPC IDs, and suggested CLI commands. Use it as a starting point: verify its findings and fill in anything marked "unknown" by reading raw-traffic.json manually.

    Enhanced analysis (v1.3.0, when captured via mitmproxy-capture.py):

    • request_sequence: Timeline-ordered requests with auth flow detection (login → token → API calls)
    • session_lifecycle: Cookie inventory, auth cookie identification, session pattern (cookie_auth/token_refresh/no_session)
    • endpoint_sizes: Response body size classification per endpoint (small/medium/large) and total data transferred

    These fields are only present when mitmproxy-capture.py was used. If missing (has_timestamps: false), rely on manual analysis.

    If traffic-analysis.json doesn't exist, run the analyzer:

    bash
    python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
      <app>/traffic-capture/raw-traffic.json --summary
    
  2. Parse raw-traffic.json (for details the analyzer couldn't extract)

  3. Group requests by base path (e.g., /api/v1/boards/, /api/v1/items/)

  4. For each endpoint group, identify:

    • HTTP method (GET/POST/PUT/DELETE/PATCH)
    • URL pattern (extract path parameters like :id)
    • Query parameters and their types
    • Request body schema (JSON fields, types, required/optional)
    • Response body schema
    • Authentication method (Bearer token, cookie, API key)
    • Rate limiting signals (429 responses, retry-after headers)
  5. Identify RPC protocol type -- classify the API transport:

    | Protocol | Detection Signal | Client Pattern |
    | --- | --- | --- |
    | REST | Resource URLs (/api/v1/boards/:id), standard HTTP methods | client.py with method-per-endpoint |
    | GraphQL | Single /graphql endpoint, query/mutation in body | client.py with query templates |
    | gRPC-Web | application/grpc-web content type, binary payloads | Proto-based client |
    | Google batchexecute | batchexecute in URL, f.req= body, )]}'\n prefix | rpc/ subpackage (see references/google-batchexecute.md) |
    | Custom RPC | Single endpoint, method name in body, proprietary encoding | Custom codec module |
    | Public REST API | Documented /api/ endpoints, OpenAPI spec, JSON responses | Standard client.py with httpx |
    | Plain HTML (no framework) | No SPA root, no framework globals, data in <table>/<div> | client.py with httpx + BeautifulSoup4 |

    This determines client architecture in Step B -- REST uses simple client.py, non-REST protocols need a dedicated rpc/ subpackage with encoder/decoder/types.

  6. Detect data model:

    • Entity types (boards, items, users, projects...)
    • Relationships (board has many items, item belongs to board)
    • ID formats (UUID, numeric, slug)
  7. Detect auth pattern:

    • Cookie-based sessions
    • Bearer/JWT tokens
    • OAuth refresh flow
    • API key headers
    • Browser-delegated auth: tokens embedded in page JavaScript (e.g., WIZ_global_data), not in HTTP headers. Requires CDP for initial cookies, HTTP for token extraction. See references/auth-strategies.md "Browser-Delegated Auth" section.
    • No auth / public access: fully public API, no login required. CLI may optionally support API key auth for write operations (e.g., dev.to).
  8. Write <APP>.md -- software-specific SOP document

Output: <APP>.md with API map, data model, auth scheme.
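
Steps 3 and 4 (grouping by base path and extracting path parameters) can be sketched as follows. This is an illustration under stated assumptions: each traffic entry is taken to be a dict with `url` and `method` keys, which may not match the real raw-traffic.json layout, and `url_pattern` and `group_endpoints` are hypothetical helper names. Only numeric and UUID segments are treated as IDs; slug-style IDs need manual review.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# A path segment counts as an ID if it is all digits or a canonical UUID.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def url_pattern(url):
    """Replace ID-like path segments with ':id' and drop the query string."""
    path = urlsplit(url).path
    return "/".join(":id" if _ID_SEGMENT.match(p) else p for p in path.split("/"))


def group_endpoints(entries):
    """Group (method, pattern) pairs under the path prefix before the first ':id'."""
    groups = defaultdict(set)
    for entry in entries:
        pattern = url_pattern(entry["url"])
        base = pattern.split("/:id")[0].rstrip("/") + "/"
        groups[base].add((entry["method"].upper(), pattern))
    return {base: sorted(ops) for base, ops in groups.items()}


if __name__ == "__main__":
    sample = [
        {"method": "GET", "url": "https://example.com/api/v1/boards"},
        {"method": "GET", "url": "https://example.com/api/v1/boards/42"},
        {"method": "POST", "url": "https://example.com/api/v1/items?draft=1"},
    ]
    for base, ops in group_endpoints(sample).items():
        print(base, ops)
```

Nested resources such as /api/v1/boards/42/items land in the parent group (/api/v1/boards/) with this rule, which is usually what the data-model step wants to see first.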

References: traffic-patterns.md, google-batchexecute.md, ssr-patterns.md
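
The first rows of the protocol table in step 5 lend themselves to a rough automatic check. The sketch below is a heuristic only: `classify_protocol` is a hypothetical function, the caller is assumed to supply the request URL, a content-type string, and the raw request body, and anything it cannot recognize (REST, custom RPC, plain HTML) is returned as "rest-or-unknown" for manual inspection.

```python
def classify_protocol(url, content_type="", body=""):
    """Guess the transport for one captured request (heuristic, not definitive)."""
    path = url.split("?")[0].rstrip("/")

    # Google batchexecute: 'batchexecute' in the URL and an f.req= form body.
    if "batchexecute" in url and "f.req=" in body:
        return "batchexecute"

    # gRPC-Web: identified by its content type.
    if "application/grpc-web" in content_type.lower():
        return "grpc-web"

    # GraphQL: a single /graphql endpoint.
    if path.endswith("/graphql"):
        return "graphql"

    # REST, custom RPC, and plain HTML cannot be told apart from one request.
    return "rest-or-unknown"
```

Classifying every request and looking at the most common answer is usually more reliable than trusting a single request, since many sites mix a main API with a few unrelated calls.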


Step B: Implement (Code Generation)

Study Existing CLIs First (Critical for Accuracy)

Before implementing, read an existing CLI that uses the same protocol as your target. These are battle-tested implementations that solved the same problems you'll face.

| Protocol | Reference CLI | Key files to read |
| --- | --- | --- |
| Google batchexecute | notebooklm/agent-harness/cli_web/notebooklm/ | core/rpc/encoder.py, core/rpc/decoder.py, core/client.py, core/auth.py |
| GraphQL + WAF | booking/agent-harness/cli_web/booking/ | core/client.py (curl_cffi + GraphQL), core/auth.py (WAF tokens) |
| HTML scraping | futbin/agent-harness/cli_web/futbin/ | core/client.py (httpx + BS4), commands/players.py |
| HTML + Cloudflare | producthunt/agent-harness/cli_web/producthunt/ | core/client.py (curl_cffi impersonate) |
| REST API | unsplash/agent-harness/cli_web/unsplash/ | core/client.py, commands/photos.py |
| Simple HTML | gh-trending/agent-harness/cli_web/gh_trending/ | Minimal structure example |

How to use reference CLIs:

  1. Read the reference CLI's core/client.py — understand the request/response pattern
  2. Read core/auth.py — copy the login_browser() pattern exactly for Google apps
  3. Read core/rpc/ (for batchexecute) — understand encoder/decoder, DO NOT reinvent
  4. Read commands/ — see how Click commands are structured, how --json works
  5. Read utils/helpers.py — see handle_errors(), _resolve_cli(), repl patterns

For batchexecute apps specifically, the notebooklm CLI is your bible:

  • Copy the encoder/decoder architecture (don't reinvent the batchexecute wire format)
  • Copy the auth token extraction pattern (CSRF, session ID, build label)
  • Copy the cookie domain priority logic (critical for Israeli/international users)
  • Adapt the RPC method IDs and param structures to your target app

The agent implementing the CLI MUST read these files before writing code. Use the Agent tool to dispatch a research agent that reads the reference implementation while you design the command structure.

Design Before You Code

Before writing any code, note the command structure in <APP>.md (10 minutes max):

  • Map each API endpoint group to a Click command group:
    • /api/v1/boards/* → boards command group
    • /api/v1/items/* → items command group
  • Map CRUD operations to subcommands (GET list → list, GET single → get, POST → create, PUT/PATCH → update, DELETE → delete)
  • Note auth design: auth login, auth status, auth refresh; credentials at ~/.config/cli-web-<app>/auth.json
  • Note REPL design: bare command enters REPL, branded banner via repl_skin.py
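
The endpoint-to-command mapping above is mechanical enough to sketch in code. The function below is a hypothetical helper (`subcommand_for` is not part of the plugin) that assumes URL patterns already use `:id`-style placeholders for path parameters.

```python
WRITE_ACTIONS = {
    "POST": "create",
    "PUT": "update",
    "PATCH": "update",
    "DELETE": "delete",
}


def subcommand_for(method, pattern):
    """Map an HTTP method and URL pattern to (command group, subcommand)."""
    segments = pattern.strip("/").split("/")
    targets_single = segments[-1].startswith(":")
    # The command group is the last non-parameter path segment.
    resource = next(s for s in reversed(segments) if not s.startswith(":"))

    method = method.upper()
    if method == "GET":
        return resource, ("get" if targets_single else "list")
    if method not in WRITE_ACTIONS:
        raise ValueError(f"no CRUD mapping for HTTP method {method!r}")
    return resource, WRITE_ACTIONS[method]
```

For example, `subcommand_for("GET", "/api/v1/boards/:id")` yields `("boards", "get")`, which corresponds to a `boards get` subcommand in the generated CLI. Non-CRUD endpoints (search, export, bulk actions) still need names chosen by hand.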

Goal: Generate the complete Python CLI package.

Package Structure

See HARNESS.md "Generated CLI Structure" for the complete package template. Key points: cli_web/ namespace (NO __init__.py), <app>/ sub-package (HAS __init__.py), core/, commands/, utils/, tests/ directories.

Step B.0: Scaffold Core Modules

Run the scaffold generator script to create all boilerplate files:

bash
python ${CLAUDE_PLUGIN_ROOT}/scripts/scaffold-cli.py <app>/agent-harness \
  --app-name <app> \
  --protocol <rest|graphql|html-scraping|batchexecute> \
  --http-client <httpx|curl_cffi> \
  --auth-type <none|cookie|api-key|google-sso> \
  --resources <comma-separated-resources> \
  [--has-polling] [--has-context] [--has-partial-ids]

This generates exceptions.py, client.py skeleton, helpers.py, config.py, output.py, the CLI entry point with REPL, setup.py, conftest.py, repl_skin.py, and (for batchexecute) the rpc/ subpackage.

Fallback: If the script is unavailable, read ${CLAUDE_PLUGIN_ROOT}/skills/boilerplate/SKILL.md and follow its instructions to scaffold manually.

After scaffolding, review the generated files and customize client.py with actual endpoint methods from <APP>.md.

Implementation Rules

  • exceptions.py -- implement first. Required types: AppError (base), AuthError(recoverable), RateLimitError(retry_after), NetworkError, ServerError(status_code), NotFoundError. See references/exception-hierarchy-example.py for the complete template.

  • client.py -- HTTP client with exception mapping and auth retry:

    • HTTP library choice:
      • httpx (default) — for most sites (REST, GraphQL, batchexecute)
      • curl_cffi — for Cloudflare-protected sites. Uses Chrome TLS fingerprint impersonation to bypass bot detection without cookies or auth:
        python
        from curl_cffi import requests as curl_requests
        resp = curl_requests.get(url, impersonate="chrome")
        
        Use curl_cffi when Phase 1 detects Cloudflare (cf-ray header, challenge page). Add curl_cffi, beautifulsoup4 to setup.py instead of httpx.
    • Centralized auth header/cookie injection
    • Automatic JSON parsing with response body verification
    • Status code → exception mapping: 401/403→AuthError, 404→NotFoundError, 429→RateLimitError, 5xx→ServerError
    • Auth retry: On AuthError(recoverable=True), refresh tokens and retry once
    • Exponential backoff for rate limits (see references/polling-backoff-example.py)
    • For apps with 3+ resource types: split into namespaced sub-clients (client.notebooks.list(), client.sources.add())
    • See references/client-architecture-example.py for the full pattern
  • auth.py -- handles token storage, refresh, expiry. Implementation depends on auth type:

    For no-auth sites: DO NOT create auth.py, session.py, or auth command groups. These files are dead code for public APIs and confuse users. The CLI should have NO auth-related files or commands. The only exception is if the site has optional auth (e.g., API key for write operations) — in that case, implement a minimal auth module.

    For browser-delegated auth (Google, Microsoft, etc.): Full playwright-cli login flow with cookie domain priority for international users.

    See references/auth-strategies.md for all patterns (browser login, cookie priority, API key, env var, context commands). Store cookies at ~/.config/cli-web-<app>/auth.json with chmod 600.

  • Anti-bot resilient client construction (when detected in Phase 2):

    • Extract session tokens via CDP first (cookies), then HTTP GET + HTML parsing (CSRF, session IDs)
    • Never hardcode build labels (bl), session IDs (f.sid), or CSRF tokens -- extract dynamically at runtime
    • Replicate same-origin headers captured during Phase 1 traffic (e.g., x-same-domain: 1 for Google apps)
    • Implement auto-retry on 401/403: re-fetch homepage -> re-extract tokens -> retry once
    • See references/google-batchexecute.md for the complete Google pattern
  • RPC codec subpackage (for non-REST protocols like batchexecute): When the API uses a non-REST protocol, add core/rpc/ with:

    • types.py -- method ID enum, URL constants
    • encoder.py -- request encoding (protocol-specific format)
    • decoder.py -- response decoding (strip prefix, parse chunks, extract results)

    The client.py still exists but delegates encoding/decoding to rpc/.
  • Progress feedback -- Use rich>=13.0 spinners for operations >2s (suppress in --json mode). See references/rich-output-example.py.

  • JSON error output -- --json mode errors are JSON too, not plain text. Standard codes: AUTH_EXPIRED, RATE_LIMITED, NOT_FOUND, SERVER_ERROR, NETWORK_ERROR. Implement via utils/output.py json_error().

  • All commands use handle_errors(json_mode) context manager — centralizes error handling, exit codes (1=user, 2=system, 130=interrupt), and JSON errors. See references/helpers-module-example.py.

  • Generation commands support --wait, --retry N, --output path — for agent-scriptable end-to-end workflows. See references/polling-backoff-example.py.

  • Windows UTF-8 fix — Add at the top of <app>_cli.py before any imports that print:

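    python
    import sys
    # Switch stdout to UTF-8 so non-ASCII output does not crash on Windows consoles.
    if sys.stdout.encoding and sys.stdout.encoding.lower() not in ("utf-8", "utf8"):
        try:
            sys.stdout.reconfigure(encoding="utf-8", errors="replace")
        except AttributeError:
            pass  # stdout was replaced by a stream without reconfigure()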
    
  • HTML table parsers MUST extract ALL visible columns — not just name/price, because missing fields in --json output make the CLI useless for filtering and analysis. If the site shows version, club, nation, stats, skills, weak foot — parse all of them. Empty fields in --json output = incomplete parser.

  • Entry point: cli-web-<app> via setup.py console_scripts

  • Namespace: cli_web.*

  • Copy repl_skin.py from plugin for consistent REPL experience

  • utils/helpers.py -- shared CLI helpers (generate for every CLI):

    • resolve_partial_id(partial, items) — prefix-match UUIDs for get/rename/delete
    • handle_errors(json_mode) — context manager replacing try/except in all commands
    • require_notebook(notebook_arg) — gets notebook ID from arg or persistent context
    • sanitize_filename(name) — safe filenames from artifact titles
    • poll_until_complete(check_fn) — exponential backoff polling
    • get_context_value(key) / set_context_value(key, value) -- persistent context.json

    See references/helpers-module-example.py for the complete module.
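
To make three of the helpers listed above concrete, here is a minimal standard-library sketch. It is not the contents of references/helpers-module-example.py: the helper names come from the list, but the bodies, the extra keyword arguments of `poll_until_complete`, the JSON error shape, and the stand-in `AppError`/`NotFoundError` classes are all assumptions. Only the exit codes 1 and 130 are shown; the "2 = system" path is left out.

```python
import json
import sys
import time
from contextlib import contextmanager


class AppError(Exception):
    """Stand-in base error; `code` is one of the standard JSON error codes."""
    code = "SERVER_ERROR"


class NotFoundError(AppError):
    code = "NOT_FOUND"


def resolve_partial_id(partial, items):
    """Return the single item whose 'id' starts with `partial`."""
    matches = [item for item in items if item["id"].startswith(partial)]
    if len(matches) != 1:
        raise NotFoundError(
            f"ID prefix {partial!r} matched {len(matches)} items, expected 1"
        )
    return matches[0]


@contextmanager
def handle_errors(json_mode=False):
    """Turn known errors into an exit code and, in --json mode, a JSON error."""
    try:
        yield
    except KeyboardInterrupt:
        sys.exit(130)
    except AppError as exc:
        if json_mode:
            # Assumed error shape; errors stay machine-readable in --json mode.
            print(json.dumps({"error": True, "code": exc.code, "message": str(exc)}))
        else:
            print(f"Error: {exc}", file=sys.stderr)
        sys.exit(1)


def poll_until_complete(check_fn, initial_delay=1.0, max_delay=30.0,
                        max_attempts=10, sleep=time.sleep):
    """Call check_fn until it returns a non-None result, doubling the delay."""
    delay = initial_delay
    for _ in range(max_attempts):
        result = check_fn()
        if result is not None:
            return result
        sleep(delay)
        delay = min(delay * 2, max_delay)
    raise TimeoutError(f"operation did not complete after {max_attempts} checks")
```

Passing `sleep` as a parameter is a design choice for testability: unit tests can record the requested delays instead of actually waiting.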

Not all helpers apply to every CLI. Include only what the CLI uses: handle_errors and print_json are always needed. resolve_partial_id only for UUID-based apps. require_notebook/context helpers only for apps with persistent context. poll_until_complete only for generation/async operations.
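
The exception types and the status-code mapping required by the rules for exceptions.py and client.py can be sketched as below. This is an illustrative outline rather than the template in references/exception-hierarchy-example.py: the class names and the 401/403/404/429/5xx mapping come from the rules, while the constructor signatures, `raise_for_status`, and `call_with_auth_retry` are assumptions. Treating both 401 and 403 as recoverable is also an assumption.

```python
class AppError(Exception):
    """Base class for all CLI errors."""


class AuthError(AppError):
    def __init__(self, message, recoverable=True):
        super().__init__(message)
        self.recoverable = recoverable


class RateLimitError(AppError):
    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


class NetworkError(AppError):
    pass


class ServerError(AppError):
    def __init__(self, message, status_code):
        super().__init__(message)
        self.status_code = status_code


class NotFoundError(AppError):
    pass


def raise_for_status(status_code, message="", retry_after=None):
    """Map an HTTP status code to the matching exception; 2xx/3xx pass through."""
    if status_code in (401, 403):
        raise AuthError(message or "authentication failed")
    if status_code == 404:
        raise NotFoundError(message or "resource not found")
    if status_code == 429:
        raise RateLimitError(message or "rate limited", retry_after=retry_after)
    if 500 <= status_code < 600:
        raise ServerError(message or "server error", status_code=status_code)


def call_with_auth_retry(request_fn, refresh_fn):
    """Run request_fn; on a recoverable AuthError, refresh tokens and retry once."""
    try:
        return request_fn()
    except AuthError as exc:
        if not exc.recoverable:
            raise
        refresh_fn()
        return request_fn()  # a second AuthError propagates to the caller
```

Keeping the retry in one wrapper means individual endpoint methods never need their own refresh logic.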

REPL Implementation Rules (Critical)

These four bugs appear in almost every generated REPL. Get them right the first time:

1. Use shlex.split(), never line.split()

python
# ✓ Correct — handles quoted args: players search "messi" -> ['players', 'search', 'messi']
import shlex
args = shlex.split(line)

# ✗ Wrong — produces: ['players', 'search', '"messi"'] — quotes become part of the value
args = line.split()

2. Never pass **ctx.params to cli.main() in REPL dispatch

python
# ✓ Correct — preserve --json flag by prepending to args
repl_args = ["--json"] + args if ctx.obj.get("json") else args
cli.main(args=repl_args, standalone_mode=False)

# ✗ Wrong — ctx.params = {"json_mode": False} gets passed to Context.__init__()
# which doesn't accept it → TypeError: Context.__init__() got an unexpected
# keyword argument 'json_mode'
cli.main(args=args, standalone_mode=False, **ctx.params)

3. Keep _print_repl_help() in sync with the actual command surface

The _print_repl_help() function in <app>_cli.py is the user's first discovery surface — it's what they see when they type help in the REPL. It must mirror the real commands, including all key options. A REPL that shows outdated or incomplete help is confusing and makes the CLI feel broken.

python
# ✓ Correct — help lists actual options users can pass
def _print_repl_help():
    _skin.info("Available commands:")
    print("  players list [OPTIONS]")
    print("    --position <GK|ST|CM|...>    Filter by position")
    print("    --rating-min N --rating-max N  Rating range")
    print("    --cheapest                   Sort cheapest first")

# ✗ Wrong — stale help doesn't mention new --position, --rating-min, etc.
def _print_repl_help():
    print("  players list [--min-price N]   List players with filters")

Rule: every time you add options to a command, update _print_repl_help() in the same commit.


4. Use @click.argument for positional REPL params, not @click.option("--x", required=True)

REPL commands show players search <query> in help. If query is a --query option, users typing players search messi get "Error: Missing option '--query'". Use positional arguments for natural command-line style:

python
# ✓ Correct — users type: players search messi  OR  players get 21610
@players.command()
@click.argument("query")
def search(query): ...

@players.command()
@click.argument("player_id", type=int)
def get(player_id): ...

# ✗ Wrong — users get an error unless they type: players search --query messi
@players.command()
@click.option("--query", required=True)
def search(query): ...

Rule of thumb: if a command takes a single required value that would be a positional arg in a shell command (git checkout main, grep pattern), use @click.argument. Use @click.option only for optional or named parameters (--rating-min, --platform).

Parallel Implementation (dispatch independent modules as subagents)

When the CLI has 3+ command groups (e.g., notebooks, sources, chat, artifacts), dispatch parallel subagents -- one per command module. Each agent gets:

  • The <APP>.md API spec for its resource
  • The client.py and auth.py interfaces it depends on
  • Clear scope: "Implement commands/notebooks.py with list, get, create, delete"

Parallelization opportunities:

| Modules | Dispatch in parallel? |
| --- | --- |
| commands/notebooks.py, commands/sources.py, commands/chat.py | Yes -- each command file only depends on client.py |
| rpc/encoder.py and rpc/decoder.py | Yes -- encoder doesn't depend on decoder |
| auth.py and models.py | Yes -- no shared logic |
| client.py and commands/* | No -- commands depend on client |
| <app>_cli.py (entry point) | Last -- imports all commands, write after they're done |

Implementation order (with maximum parallelism):

Phase A (sequential): Write core foundation
  exceptions.py → client.py → auth.py (if needed) → models.py

Phase B (parallel): Dispatch ALL independent work simultaneously
  ┌─ Agent 1: commands/notebooks.py
  ├─ Agent 2: commands/sources.py
  ├─ Agent 3: commands/chat.py
  ├─ Agent 4: commands/artifacts.py
  ├─ Agent 5: rpc/encoder.py + rpc/decoder.py (if non-REST)
  └─ Agent 6 (background): test_core.py (unit tests for core modules)
  All run concurrently — each only depends on Phase A modules

Phase C (sequential): Wire everything together
  utils/helpers.py → <app>_cli.py → __main__.py → setup.py → copy repl_skin.py

Key parallelism rules:

  • Dispatch independent command modules as parallel subagents (one per commands/*.py file)
  • Start unit test writing as a background agent during command implementation
  • Entry point (<app>_cli.py, setup.py) must come last (depends on all commands)
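
The phase ordering above is a topological sort of the module dependency graph: every module whose dependencies are finished can be dispatched in the same batch. The sketch below uses the standard-library `graphlib` module (Python 3.9+). The `DEPENDENCIES` map is a simplified, hypothetical example for a three-command CLI, not something the plugin generates.

```python
from graphlib import TopologicalSorter

# module -> set of modules it depends on (illustrative subset)
DEPENDENCIES = {
    "exceptions.py": set(),
    "client.py": {"exceptions.py"},
    "commands/notebooks.py": {"client.py"},
    "commands/sources.py": {"client.py"},
    "commands/chat.py": {"client.py"},
    "<app>_cli.py": {
        "commands/notebooks.py",
        "commands/sources.py",
        "commands/chat.py",
    },
}


def parallel_batches(dependencies):
    """Return batches of modules; modules within a batch can be built concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(DEPENDENCIES), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

With this graph the three command modules fall into a single batch after client.py, and the entry point comes last, matching Phases A to C.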

Mandatory Smoke Check (Before Testing Phase)

Before invoking testing, install (pip install -e .) and verify:

  1. cli-web-<app> --help loads
  2. cli-web-<app> auth status --json shows valid (if auth-required)
  3. cli-web-<app> <resource> list --json returns real data
  4. One WRITE command works (if applicable)

Red flags — fix before testing:

  • wrb.fr, af.httprm in output → decoder broken
  • [] or null where data expected → wrong params or client-side operation
  • Wrong field values (e.g., "3" instead of prompt text) → parser index mismatch
  • Null write response → may be client-side, see references/google-batchexecute.md "Client-Side Operations"
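
Two of these red flags can be caught automatically from a command's --json output. The sketch below is a hypothetical `find_red_flags` helper, not part of the plugin. Flagging output that fails to parse as JSON is an added assumption, and wrong field values still need a human comparing the output against the site.

```python
import json


def find_red_flags(raw_output):
    """Return a list of red-flag descriptions for one command's --json output."""
    flags = []

    # Raw batchexecute envelope markers mean the decoder passed data through.
    if "wrb.fr" in raw_output or "af.httprm" in raw_output:
        flags.append("decoder broken: batchexecute markers in output")

    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        flags.append("output is not valid JSON")
        return flags

    # Empty list or null where data is expected.
    if data is None or data == []:
        flags.append("empty result: wrong params or client-side operation")

    return flags
```

Running it over the output of each smoke-check command gives a quick pass/fail signal before handing off to the testing skill.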

Update phase state:

bash
python ${CLAUDE_PLUGIN_ROOT}/scripts/phase-state.py complete <app> \
  --phase methodology --output <app>/agent-harness/

Next Step

When implementation is complete and the smoke check passes, invoke the testing skill to plan and write tests.

Do NOT skip testing -- every CLI must have comprehensive tests before publishing.


Companion Skills

| Skill | When it activates |
| --- | --- |
| capture | Phase 1 -- traffic recording (prerequisite for this skill) |
| testing | Phase 3 -- test writing, documentation |
| standards | Phase 4 -- publish, verify, smoke test |

Integration

| Relationship | Skill |
| --- | --- |
| Preceded by | capture (Phase 1) |
| Followed by | testing (Phase 3) |
| References | traffic-patterns.md, auth-strategies.md, google-batchexecute.md, ssr-patterns.md, exception-hierarchy-example.py, client-architecture-example.py, polling-backoff-example.py, rich-output-example.py |

Reference Files

  • references/traffic-patterns.md -- Common API patterns (REST, GraphQL, RPC)
  • references/auth-strategies.md -- Auth implementation strategies
  • references/google-batchexecute.md -- Google batchexecute RPC protocol spec
  • references/ssr-patterns.md -- SSR framework patterns and data extraction strategies
  • references/exception-hierarchy-example.py -- Complete exception hierarchy with HTTP status mapping
  • references/client-architecture-example.py -- Namespaced sub-client pattern with auth retry
  • references/polling-backoff-example.py -- Exponential backoff polling and rate-limit retry
  • references/rich-output-example.py -- Rich progress bars, JSON error responses, table formatting
  • references/helpers-module-example.py -- Shared CLI helpers (handle_errors, resolve_partial_id, polling, persistent context)
