Agent skill
observability-audit
Audit code for observability gaps — debug logs left in, errors caught without being logged, missing context on log entries, untracked slow operations. Uses the app's existing observability tooling exclusively.
Install this agent skill to your Project
npx add-skill https://github.com/markmdev/meridian/tree/main/skills/observability-audit
SKILL.md
Observability Audit
This skill targets code that works locally but is impossible to debug in production. It finds and fixes observability gaps using whatever tools the app already has.
Step 0: Research existing observability tooling
Before anything else, explore the codebase to understand what's already in use:
- Error tracking: Sentry, Bugsnag, Rollbar, or similar?
- Logging: structured logger (Pino, Winston, Bunyan), cloud provider SDK, custom wrapper?
- APM / metrics: Datadog, New Relic, OpenTelemetry?
- Analytics: PostHog, Segment, Amplitude?
- Any custom logger or telemetry utilities?
Read how they're configured and how they're used in existing code. All fixes must use these — never introduce a new observability dependency or pattern.
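As a rough starting point for this research step, the manifest can be checked for well-known observability packages. The sketch below is only an illustration: the `KNOWN_TOOLS` table, the `detectTooling` helper, and the choice of package names are assumptions and far from exhaustive. It does not replace reading how the tools are configured and called in the code.

```typescript
// Hypothetical helper: classify dependencies from a package.json-like manifest.
type Category = "errorTracking" | "logging" | "apm" | "analytics";

// Illustrative, non-exhaustive list of package names (an assumption, extend as needed).
const KNOWN_TOOLS = new Map<string, Category>([
  ["@sentry/node", "errorTracking"],
  ["@bugsnag/js", "errorTracking"],
  ["rollbar", "errorTracking"],
  ["pino", "logging"],
  ["winston", "logging"],
  ["bunyan", "logging"],
  ["dd-trace", "apm"],
  ["newrelic", "apm"],
  ["@opentelemetry/api", "apm"],
  ["posthog-node", "analytics"],
]);

interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns the recognised package names grouped by category.
function detectTooling(manifest: PackageManifest): Partial<Record<Category, string[]>> {
  const found: Partial<Record<Category, string[]>> = {};
  const names = Object.keys({ ...manifest.dependencies, ...manifest.devDependencies });
  for (const name of names) {
    const category = KNOWN_TOOLS.get(name);
    if (category === undefined) continue; // not an observability package we know
    const list = found[category] ?? [];
    list.push(name);
    found[category] = list;
  }
  return found;
}
```

An empty result does not mean the app has no tooling: a custom wrapper or a cloud provider SDK would not show up here, so searching the source for logger imports is still needed.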
What to look for
Debug artifacts left in production code:
- console.log, console.debug, console.info calls that aren't part of the established logging pattern
- Temporary logging added during debugging and never removed
- Commented-out log statements
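A first pass over these debug artifacts can be automated with a line-based scan. This is a minimal sketch under stated assumptions: `findDebugArtifacts` is a hypothetical helper, the regular expressions only catch single-line `//` comments and direct `console.*` calls, and the name `logger` is assumed for the app's logger. Every hit still needs a human check against the established logging pattern.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  kind: "console-call" | "commented-out-log";
  text: string;
}

// Direct console.log / console.debug / console.info calls.
const CONSOLE_CALL = /\bconsole\.(log|debug|info)\s*\(/;
// A // comment containing a console.* or logger.* call ("logger" is an assumed name).
const COMMENTED_LOG = /^\s*\/\/.*\b(console|logger)\.\w+\s*\(/;

function findDebugArtifacts(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    if (COMMENTED_LOG.test(text)) {
      findings.push({ line: index + 1, kind: "commented-out-log", text: text.trim() });
    } else if (CONSOLE_CALL.test(text)) {
      findings.push({ line: index + 1, kind: "console-call", text: text.trim() });
    }
  });
  return findings;
}
```

The heuristic will miss multi-line calls and block comments, and it can flag `console` calls inside string literals, so it narrows the search rather than deciding what to delete.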
Errors that disappear:
- catch(e) that propagates or re-throws without logging first: the error reaches the user but leaves no trace for debugging
- Errors logged with no context: logger.error(e) alone, with no info about what operation failed, what inputs were involved, or what the user was doing
- Errors tracked to the error tracker but without relevant metadata (user ID, request ID, relevant state)
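The fix for a disappearing error usually looks like the sketch below: log once at the catch site with the operation name and relevant inputs, then re-throw. The `Logger` interface is a hypothetical stand-in for whatever structured logger the app already uses, and `parseWebhookPayload` with its context fields is an invented example, not part of this skill.

```typescript
// Hypothetical stand-in for the app's existing structured logger.
interface Logger {
  error(context: Record<string, unknown>, message: string): void;
}

function parseWebhookPayload(
  logger: Logger,
  raw: string,
  ctx: { requestId: string; source: string },
): unknown {
  try {
    return JSON.parse(raw);
  } catch (err) {
    // Log what failed and enough context to find the request again,
    // then re-throw so the caller's error handling is unchanged.
    logger.error(
      {
        err,
        operation: "parseWebhookPayload",
        requestId: ctx.requestId,
        source: ctx.source,
        payloadLength: raw.length,
      },
      "Webhook payload is not valid JSON",
    );
    throw err;
  }
}
```

Logging the payload length instead of the payload itself is a deliberate choice in this sketch: raw inputs may contain sensitive data, so what counts as safe context depends on the app.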
Missing context on log entries:
- Logs that say what happened but not enough to reproduce it (no entity IDs, no relevant parameters)
- No correlation/request ID on logs in request-handling code
- Log entries that can't be connected to a specific user or session when that would be needed for debugging
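One way to close these gaps is to bind the request ID and user ID once, at the start of request handling, so every later entry carries them. The `withContext` wrapper below is a hypothetical illustration over an assumed `Logger` interface. If the app's existing logger already offers child or bound loggers, use that feature instead of adding a wrapper like this.

```typescript
// Hypothetical stand-in for the app's existing structured logger.
interface Logger {
  info(context: Record<string, unknown>, message: string): void;
  error(context: Record<string, unknown>, message: string): void;
}

// Returns a logger that merges the bound fields into every entry.
// Per-call context wins when a key appears in both.
function withContext(base: Logger, bound: Record<string, unknown>): Logger {
  return {
    info: (context, message) => base.info({ ...bound, ...context }, message),
    error: (context, message) => base.error({ ...bound, ...context }, message),
  };
}

// Example handler (names are illustrative): entity IDs go on the entry,
// request and user IDs come from the bound context.
function handleOrderLookup(base: Logger, requestId: string, userId: string, orderId: string): void {
  const log = withContext(base, { requestId, userId });
  log.info({ orderId }, "Order lookup started");
}
```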
Untracked slow or critical operations:
- External API calls with no timing logged on failure or when slow
- Database queries with no observability when they're critical path
- Background jobs or queues with no start/complete/fail tracking
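For external calls, critical queries, and job handlers, a small tracker that records the duration and reports only failures and slow runs keeps log volume low. The sketch below is an assumption-laden illustration: `trackOperation`, the `Logger` interface, the `slowMs` threshold, and the injectable `now` clock are all hypothetical names, and the calls should be mapped onto the app's existing logger or APM client.

```typescript
// Hypothetical stand-in for the app's existing structured logger.
interface Logger {
  warn(context: Record<string, unknown>, message: string): void;
  error(context: Record<string, unknown>, message: string): void;
}

interface Tracker {
  succeed(): void;
  fail(err: unknown): void;
}

// Starts timing an operation. Call succeed() or fail(err) when it ends.
function trackOperation(
  logger: Logger,
  operation: string,
  context: Record<string, unknown>,
  slowMs: number,
  now: () => number = Date.now, // injectable clock, useful in tests
): Tracker {
  const startedAt = now();
  return {
    succeed(): void {
      const durationMs = now() - startedAt;
      // Only slow runs are reported; fast successes stay silent.
      if (durationMs >= slowMs) {
        logger.warn({ ...context, operation, durationMs, slowMs }, "Operation was slow");
      }
    },
    fail(err: unknown): void {
      logger.error({ ...context, operation, durationMs: now() - startedAt, err }, "Operation failed");
    },
  };
}
```

In an async call site this would wrap the awaited call: create the tracker before the call, call `succeed()` after it resolves, and call `fail(err)` in the catch block before re-throwing.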
Process
- Research existing tooling (Step 0) — do not skip this
- Identify the scope from the user's request
- Find every instance of the anti-patterns above
- Fix using the existing tooling and patterns
- Remove debug artifacts, add context to thin logs, add tracking where missing
- Report changes
Fix principles
- Every caught error should be logged with enough context to reproduce the problem
- Use the existing logger/tracker — never introduce a second one
- Debug console.log goes away entirely: no conversion to a structured log, just deleted
- Log context should include: what operation, what failed, relevant IDs (user, entity, request)
- Don't add logging to every function — focus on boundaries (external calls, queue handlers, critical paths)
Reference files
- references/observability-patterns.md: detection patterns and bad/fix examples for debug artifacts, missing logging, missing context, and untracked operations. Read before starting the audit.
Report
Summarize by file: what was removed, what was added or improved, what context was missing and is now included.