Agent skill

agent-telemetry

Make application behavior visible to coding agents by exposing structured logs and telemetry. Use when asked to "add telemetry", "make logs accessible to agents", "add observability", "debug with logs", or when an agent needs to understand runtime behavior but has no way to query logs. Also use when debugging is difficult because there are no structured logs, when agent docs (CLAUDE.md, AGENTS.md) lack instructions for querying application logs, or when setting up logging infrastructure for a new or existing web application.

Install this agent skill in your project:

npx add-skill https://github.com/petekp/agent-skills/tree/main/skills/agent-telemetry

SKILL.md

Agent Telemetry

Make application runtime behavior queryable by coding agents through structured logging and telemetry endpoints.

Core Problem

Coding agents debugging issues often can't answer "what actually happened at runtime?" because:

  • Logs don't exist, or are unstructured console.log noise
  • Logs exist but there's no documented way for agents to query them
  • Agent docs (CLAUDE.md, AGENTS.md) don't mention how to access telemetry

Workflow

Phase 1: Audit Current State

Determine what telemetry already exists.

1. Check for logging infrastructure:

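```bash
# Find logging configuration and usage.
# The brace list stays unquoted so the shell expands it into one --include per extension;
# grep itself does not understand {a,b} inside a quoted glob.
grep -r "winston\|pino\|bunyan\|log4j\|slog\|Logger\|logging\.config" --include=\*.{ts,js,py,rb,go,rs} -l .
```

```bash
# Find log output configuration
grep -r "LOG_LEVEL\|LOG_FORMAT\|LOG_FILE\|OTEL_\|SENTRY_DSN" .env* config/ -l 2>/dev/null
```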

2. Check for existing telemetry endpoints:

```bash
# Health/debug/metrics endpoints
# (unquoted brace list: the shell expands it into one --include per extension)
grep -r "health\|metrics\|debug\|status\|readiness\|liveness" --include=\*.{ts,js,py,rb,go} -l src/ app/ 2>/dev/null
```

3. Check agent docs for log access instructions:

```bash
# Do agent docs mention logs?
grep -ri "log\|telemetry\|debug\|observ" CLAUDE.md AGENTS.md .claude/*.md .cursor/*.md 2>/dev/null
```

4. Classify the result:

| Finding | Action |
|---------|--------|
| No structured logging exists | Go to Phase 2 |
| Logging exists but no agent access | Go to Phase 3 |
| Logging + access exists but undocumented | Go to Phase 4 |
| Everything in place | Validate and suggest improvements |
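
The table is an ordered check: the first missing capability decides the next phase. A minimal sketch of that decision, where `next_phase` and its three boolean parameters are hypothetical names standing for the results of the audit steps above:

```python
def next_phase(has_structured_logging: bool, has_agent_access: bool, is_documented: bool) -> str:
    """Map the three audit findings to the next action, checking them in table order."""
    if not has_structured_logging:
        return "Phase 2: Add Structured Logging"
    if not has_agent_access:
        return "Phase 3: Expose Logs to Agents"
    if not is_documented:
        return "Phase 4: Document in Agent Docs"
    return "Validate and suggest improvements"


# Logging exists and is queryable, but the agent docs never mention it.
print(next_phase(True, True, False))
```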

Phase 2: Add Structured Logging

If no structured logging exists, add it. See references/logging-setup.md for framework-specific patterns.

Principles:

  • Use structured JSON logs, not string interpolation
  • Include correlation IDs for request tracing
  • Log at boundaries: incoming requests, outgoing calls, errors, state transitions
  • Use consistent field names: timestamp, level, message, requestId, userId, duration, error
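
As one illustration of these principles, here is a minimal sketch using Python's standard `logging` module. It is not the skill's reference implementation: `JsonFormatter`, `build_logger`, and `CONTEXT_FIELDS` are hypothetical names, and writing the level as `warn` instead of Python's `warning` is an assumption made to match the field values used elsewhere in this document.

```python
import json
import logging
from datetime import datetime, timezone

# Context fields copied from the record when present (names from the list above).
CONTEXT_FIELDS = ("requestId", "userId", "duration", "error")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        level = "warn" if record.levelname == "WARNING" else record.levelname.lower()
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": level,
            "message": record.getMessage(),
        }
        for field in CONTEXT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


def build_logger(name: str = "app") -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


log = build_logger()
# Context travels as fields via `extra`, not interpolated into the message string.
log.info("order created", extra={"requestId": "abc-123", "userId": "u42", "duration": 45})
```

Passing context through `extra` keeps the message constant across requests, so an agent can filter on `requestId` or `userId` without parsing free text.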

Where to add logging (priority order):

  1. Request/response middleware (every request gets logged)
  2. Error handlers (unhandled errors get captured with context)
  3. External service calls (DB queries, API calls, queue operations)
  4. Business logic decision points (state transitions, authorization decisions)

Minimum viable logging — add a request logger middleware that captures:

{timestamp, level, requestId, method, path, statusCode, duration, userId?}

This single addition makes most debugging possible.
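
A request logger of this kind could be sketched as WSGI middleware in Python. Everything beyond the field list above is an assumption: the class name `RequestLogMiddleware`, reading the correlation ID from an `X-Request-ID` header, taking `userId` from `REMOTE_USER`, and reporting `duration` in milliseconds.

```python
import json
import sys
import time
import uuid
from datetime import datetime, timezone


class RequestLogMiddleware:
    """Wrap a WSGI app and write one JSON log line per request."""

    def __init__(self, app, stream=sys.stderr):
        self.app = app
        self.stream = stream

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        # Reuse an incoming correlation ID if present, otherwise mint one (assumed header name).
        request_id = environ.get("HTTP_X_REQUEST_ID") or uuid.uuid4().hex
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            captured["statusCode"] = int(status.split(" ", 1)[0])
            return start_response(status, headers, exc_info)

        try:
            return self.app(environ, recording_start_response)
        finally:
            # 500 is reported when the app raised before setting a status.
            status_code = captured.get("statusCode", 500)
            entry = {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "level": "error" if status_code >= 500 else "info",
                "requestId": request_id,
                "method": environ.get("REQUEST_METHOD", ""),
                "path": environ.get("PATH_INFO", ""),
                "statusCode": status_code,
                "duration": round((time.perf_counter() - started) * 1000),  # milliseconds
            }
            user_id = environ.get("REMOTE_USER")
            if user_id:  # userId is optional
                entry["userId"] = user_id
            self.stream.write(json.dumps(entry) + "\n")
```

The sketch measures time until the app returns its response iterable, so a streamed body is not included in `duration`.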

Phase 3: Expose Logs to Agents

Agents need a way to query logs without SSH access or cloud console dashboards. Provide at least one of:

Option A: Log file (simplest)

Write structured logs, one JSON object per line, to a known file path agents can read directly.

```bash
# Agent reads recent errors
tail -n 100 logs/app.json | jq 'select(.level == "error")'

# Agent reads logs for a specific request
grep "requestId.*abc123" logs/app.json | jq .
```

Option B: Dev log endpoint (recommended for web apps)

Add a development-only endpoint that returns recent log entries with filtering.

```
GET /__dev/logs?level=error&last=50
GET /__dev/logs?path=/api/users&last=20
GET /__dev/logs?requestId=abc-123
```

This endpoint must:

  • Only be available in development (NODE_ENV=development or equivalent)
  • Return JSON array of log entries
  • Support filtering by level, path, time range, and requestId
  • Limit response size (default 100 entries)

See references/dev-endpoint.md for implementation patterns by framework.
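
The requirements above can be sketched in a framework-neutral way: an in-memory buffer, a filter function, and a handler that a route for `/__dev/logs` would call. This is only an illustration under stated assumptions. `recent_logs`, `query_logs`, `handle_dev_logs`, the `APP_ENV` variable, the buffer size, and the `since` parameter are all hypothetical; `level`, `path`, `requestId`, and `last` follow the example URLs.

```python
import os
from collections import deque
from urllib.parse import parse_qs

MAX_BUFFERED = 1000  # assumed buffer size; old entries fall off the front
DEFAULT_LIMIT = 100  # default response size from the list above

recent_logs = deque(maxlen=MAX_BUFFERED)  # the logger appends one dict per entry


def dev_logs_enabled() -> bool:
    """Development only. APP_ENV is an assumed name; use your stack's equivalent."""
    return os.environ.get("APP_ENV") == "development"


def query_logs(entries, level=None, path=None, request_id=None, since=None, last=DEFAULT_LIMIT):
    """Return the newest `last` entries that match every filter that is set."""
    matches = [
        e for e in entries
        if (level is None or e.get("level") == level)
        and (path is None or e.get("path") == path)
        and (request_id is None or e.get("requestId") == request_id)
        # Assumes ISO-8601 timestamps in one timezone, which compare correctly as strings.
        and (since is None or e.get("timestamp", "") >= since)
    ]
    last = max(0, min(last, MAX_BUFFERED))
    return matches[-last:] if last else []


def handle_dev_logs(query_string: str):
    """Return (status code, JSON-serialisable body) for a raw query string."""
    if not dev_logs_enabled():
        return 404, []
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    try:
        last = int(params.get("last", DEFAULT_LIMIT))
    except ValueError:
        last = DEFAULT_LIMIT
    return 200, query_logs(
        recent_logs,
        level=params.get("level"),
        path=params.get("path"),
        request_id=params.get("requestId"),
        since=params.get("since"),
        last=last,
    )
```

Returning 404 outside development makes the route indistinguishable from a missing one, so nothing about the logs is exposed in production.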

Option C: CLI query tool

Wrap log access in a script agents can execute:

```bash
# Query recent errors
./scripts/query-logs.sh --level error --last 50

# Query by request path
./scripts/query-logs.sh --path /api/users --since "5 minutes ago"
```
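
The script itself is not shown here, so the following is only a sketch of what a Python equivalent might look like, accepting the same flags as the commands above. The names `parse_since`, `read_entries`, and `main`, the `--file` flag with its `logs/app.json` default, and the restriction of `--since` to "N minutes ago" or "N hours ago" are all assumptions. Timestamps are assumed to be ISO-8601 with an explicit offset such as `+00:00`.

```python
import argparse
import json
import re
from datetime import datetime, timedelta, timezone


def parse_since(text: str, now: datetime) -> datetime:
    """Turn 'N minutes ago' or 'N hours ago' into an absolute cutoff time."""
    match = re.fullmatch(r"(\d+) (minute|hour)s? ago", text.strip())
    if not match:
        raise ValueError(f"unsupported --since value: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return now - timedelta(**{unit + "s": amount})


def read_entries(log_path: str):
    """Yield parsed JSON-lines entries, skipping blank or non-JSON lines."""
    with open(log_path, encoding="utf-8") as handle:
        for line in handle:
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(entry, dict):
                yield entry


def main(argv=None, now=None) -> int:
    parser = argparse.ArgumentParser(description="Query JSON-lines application logs")
    parser.add_argument("--file", default="logs/app.json")
    parser.add_argument("--level")
    parser.add_argument("--path")
    parser.add_argument("--since", help='for example "5 minutes ago"')
    parser.add_argument("--last", type=int, default=50)
    args = parser.parse_args(argv)

    cutoff = parse_since(args.since, now or datetime.now(timezone.utc)) if args.since else None
    matches = []
    for entry in read_entries(args.file):
        if args.level and entry.get("level") != args.level:
            continue
        if args.path and entry.get("path") != args.path:
            continue
        if cutoff is not None:
            stamp = entry.get("timestamp")
            if stamp is None or datetime.fromisoformat(stamp) < cutoff:
                continue
        matches.append(entry)

    # Print the newest matches, one JSON object per line, so output pipes cleanly into jq.
    for entry in (matches[-args.last:] if args.last > 0 else []):
        print(json.dumps(entry))
    return 0
```

To run it as a script, call `main()` under an `if __name__ == "__main__":` guard. The `now` parameter exists only so the time filter can be tested deterministically.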

Choose based on project context:

| Project Type | Best Option |
|--------------|-------------|
| Next.js / Express / Rails with local dev | Option B (dev endpoint) |
| CLI tool or background worker | Option A (log file) |
| Docker-based development | Option A (mounted log volume) or Option C |
| Monorepo with multiple services | Option C (unified query script) |

Phase 4: Document in Agent Docs

This is critical. Without documentation, agents won't know telemetry exists.

Update CLAUDE.md (or equivalent agent doc) with a Debugging section:

````markdown
## Debugging

### Querying Application Logs

Structured JSON logs are available at [location].

**Quick commands:**

```bash
# View recent errors
[command to view errors]

# View logs for a specific endpoint
[command to filter by path]

# View logs for a specific request
[command to filter by request ID]

# View logs from the last N minutes
[command to filter by time]
```

**Log format:**

```json
{
  "timestamp": "ISO-8601",
  "level": "info|warn|error",
  "message": "Human-readable description",
  "requestId": "correlation-id",
  "method": "GET",
  "path": "/api/resource",
  "statusCode": 200,
  "duration": 45
}
```

**Common debugging workflows:**

- User reports error → query by time range and error level
- Flaky test → query by endpoint path during test run
- Performance issue → query by path, sort by duration
````

**Key rules for the documentation:**
- Include copy-pasteable commands (agents run commands; they should not have to reconstruct them from prose)
- Show the log schema so agents know what fields to filter on
- List 3-4 common debugging workflows with exact commands
- Mention where log config lives for agents that need to adjust log levels

### Phase 5: Validate

Test the full loop:

1. **Trigger a request** — hit an endpoint or run an operation
2. **Query the logs** — use the documented method to find the log entry
3. **Verify agent usability** — can an agent find the relevant log in <3 commands?
4. **Check error capture** — trigger an error and verify it appears with full context

If any step fails, iterate on the logging or documentation.
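
Steps 2 and 4 can be partly automated. The sketch below assumes a JSON-lines log file and the minimum field set from Phase 2; `REQUIRED_FIELDS`, `find_by_request_id`, and `check_entry` are hypothetical names, and triggering the request itself is left to whatever client the project already uses.

```python
import json

# Minimum request-log fields from Phase 2 (userId is optional and not checked).
REQUIRED_FIELDS = {"timestamp", "level", "requestId", "method", "path", "statusCode", "duration"}


def find_by_request_id(log_path: str, request_id: str):
    """Return the first log entry carrying this requestId, or None."""
    with open(log_path, encoding="utf-8") as handle:
        for line in handle:
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate stray non-JSON output in the log file
            if isinstance(entry, dict) and entry.get("requestId") == request_id:
                return entry
    return None


def check_entry(entry) -> list:
    """List problems with a looked-up entry; an empty list means the loop works."""
    if entry is None:
        return ["no log entry found for the request"]
    missing = sorted(REQUIRED_FIELDS - set(entry))
    return [f"missing field: {name}" for name in missing]
```

A typical use would be to send a request with a known request ID, then call `check_entry(find_by_request_id("logs/app.json", that_id))` and treat any returned problem as a reason to iterate.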

## Anti-Patterns

| Anti-Pattern | Why It's Bad | Do Instead |
|-------------|-------------|-----------|
| `console.log("here")` | No structure, no context, no filtering | Structured JSON with consistent fields |
| Logs only in cloud dashboard | Agents can't access Datadog/CloudWatch | Local file or dev endpoint |
| Log everything at debug level | Too noisy, can't find signal | Log at boundaries, use appropriate levels |
| Logging sensitive data | PII in logs is a liability | Redact tokens, passwords, PII |
| No request correlation | Can't trace a request across log lines | Add requestId to every log entry |
| Docs say "check the logs" with no how | Agent doesn't know where or how | Exact commands with examples |
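
For the "Logging sensitive data" row, redaction can happen just before an entry is serialised. This is a minimal sketch, not a complete PII filter: `redact`, `SENSITIVE_MARKERS`, and the email pattern are assumptions, and a real project would extend both lists to fit its own data.

```python
import re

# Any key whose normalised name contains one of these markers is masked (assumed list).
SENSITIVE_MARKERS = ("password", "token", "authorization", "cookie", "secret", "apikey")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(value, key=""):
    """Return a copy of a log entry with sensitive values masked."""
    normalised = str(key).lower().replace("_", "").replace("-", "")
    if any(marker in normalised for marker in SENSITIVE_MARKERS):
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[EMAIL]", value)
    return value


entry = {
    "message": "login failed for a.user@example.com",
    "password": "hunter2",
    "headers": {"Authorization": "Bearer abc", "accept": "*/*"},
    "statusCode": 401,
}
print(redact(entry))
```

Because `redact` builds a new structure instead of mutating its input, the original entry stays available to application code that still needs the real values.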
