Agent skill
mcaf-observability
Design or improve observability for application and delivery flows: logs, metrics, traces, correlation, alerts, and operational diagnostics. Use when a change affects runtime visibility, failure diagnosis, SLOs, or alerting.
Install this agent skill in your project:
npx add-skill https://github.com/managedcode/MCAF/tree/main/skills/mcaf-observability
SKILL.md
MCAF: Observability
Trigger On
- a change affects runtime visibility or failure diagnosis
- logs, metrics, traces, or alerts are missing or vague
- the team cannot answer "how will we know this broke?"
Value
- produce a concrete project delta: code, docs, config, tests, CI, or review artifact
- reduce ambiguity through explicit planning, verification, and final validation skills
- leave reusable project context so future tasks are faster and safer
Do Not Use For
- feature behaviour work with no runtime visibility impact
- generic monitoring talk with no concrete flow to instrument
Inputs
- the critical user or system flow under change
- current logs, metrics, traces, dashboards, and alerts
- operator expectations for diagnosis and response
Quick Start
- Read the nearest `AGENTS.md` and confirm scope and constraints.
- Run this skill's `Workflow` through the `Ralph Loop` until outcomes are acceptable.
- Return the `Required Result Format` with concrete artifacts and verification evidence.
Workflow
- Identify the critical user or system flow that needs visibility.
- Define what must be observable:
  - success and failure
  - latency and throughput
  - correlation across boundaries
  - actionable alerting
- Treat observability as part of done, not an afterthought.
- Load only the references that match the affected runtime concern.
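As a rough illustration of the signals listed in the workflow, the sketch below wraps one flow so that each run reports success or failure, latency, and a correlation ID. It uses only the Python standard library. The names `observe_flow`, `outcome_counts`, and `latencies_ms` are hypothetical and not part of this skill; the in-memory stores stand in for whatever metrics client the project actually uses.

```python
import json
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logger = logging.getLogger("flow")

# Hypothetical in-memory metric stores; a real project would use its metrics client.
outcome_counts = Counter()
latencies_ms = []


@contextmanager
def observe_flow(flow_name, correlation_id=None):
    """Emit one structured log line per run: outcome, latency, correlation ID."""
    # Reuse the caller's ID when one is passed in, so the flow can be
    # followed across service boundaries; otherwise mint a new one.
    correlation_id = correlation_id or str(uuid.uuid4())
    started = time.perf_counter()
    outcome = "success"
    try:
        yield correlation_id
    except Exception:
        outcome = "failure"
        raise  # never swallow the error, only record it
    finally:
        duration_ms = (time.perf_counter() - started) * 1000
        outcome_counts[(flow_name, outcome)] += 1
        latencies_ms.append(duration_ms)
        record = {
            "flow": flow_name,
            "outcome": outcome,
            "duration_ms": round(duration_ms, 2),
            "correlation_id": correlation_id,
        }
        log = logger.error if outcome == "failure" else logger.info
        log(json.dumps(record))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    with observe_flow("checkout", correlation_id="req-123"):
        pass  # the real work of the flow goes here
```

Throughput can be derived from `outcome_counts` over a time window. Alerting is deliberately left out of this sketch, since it belongs in the monitoring system rather than in the flow itself.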
Deliver
- observability requirements for the changed flow
- updated logging, metrics, traces, or alerting guidance
- clear operator and engineer visibility expectations
Validate
- a failure can be detected and diagnosed from the chosen signals
- alerts are actionable, not noise
- cross-boundary correlation is possible where the flow needs it
- the observability plan matches user impact and operator needs
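One way to make "actionable, not noise" concrete is sketched below, under assumptions: an error-rate check that stays silent on low traffic and attaches a first step for the operator when it fires. The function name, the 5% threshold, the 50-request minimum, and the default `next_step` text are all hypothetical placeholders, not values prescribed by this skill.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    flow: str
    error_rate: float
    summary: str
    next_step: str  # what the operator should look at first


def evaluate_error_rate(
    flow: str,
    failures: int,
    total: int,
    threshold: float = 0.05,   # assumed value; tune per flow
    min_requests: int = 50,    # assumed value; avoids paging on tiny samples
    next_step: str = "check recent deploys and the flow's error logs",
) -> Optional[Alert]:
    """Return an Alert only when the failure rate is both significant and high."""
    if total < min_requests:
        return None  # too little traffic to judge; alerting here would be noise
    rate = failures / total
    if rate <= threshold:
        return None
    return Alert(
        flow=flow,
        error_rate=rate,
        summary=f"{flow}: {rate:.1%} of {total} requests failed "
                f"(threshold {threshold:.0%})",
        next_step=next_step,
    )


if __name__ == "__main__":
    print(evaluate_error_rate("payments-api", failures=12, total=100))
```

The point of the `next_step` field is that an alert which cannot name a first diagnostic action is a candidate for removal or demotion to a dashboard.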
Ralph Loop
Use the Ralph Loop for every task, including docs, architecture, testing, and tooling work.
- Brainstorm first (mandatory):
  - analyze current state
  - define the problem, target outcome, constraints, and risks
  - generate options and think through trade-offs before committing
  - capture the recommended direction and open questions
- Plan second (mandatory):
  - write a detailed execution plan from the chosen direction
  - list final validation skills to run at the end, with order and reason
- Execute one planned step and produce a concrete delta.
- Review the result and capture findings with actionable next fixes.
- Apply fixes in small batches and rerun the relevant checks or review steps.
- Update the plan after each iteration.
- Repeat until outcomes are acceptable or only explicit exceptions remain.
- If a dependency is missing, bootstrap it or return `status: not_applicable` with explicit reason and fallback path.
Required Result Format
- `status`: `complete` | `clean` | `improved` | `configured` | `not_applicable` | `blocked`
- `plan`: concise plan and current iteration step
- `actions_taken`: concrete changes made
- `validation_skills`: final skills run, or skipped with reasons
- `verification`: commands, checks, or review evidence summary
- `remaining`: top unresolved items or `none`

For setup-only requests with no execution, return `status: configured` and exact next commands.
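The result fields above could be modelled as a small data structure that rejects unknown status values. This is only a sketch: the class name `SkillResult`, the `key: value` rendering, and the use of `none` for every empty list (the skill only specifies it for `remaining`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Status values taken from the result format described above.
ALLOWED_STATUSES = {
    "complete", "clean", "improved", "configured", "not_applicable", "blocked",
}


@dataclass
class SkillResult:
    status: str
    plan: str
    actions_taken: List[str] = field(default_factory=list)
    validation_skills: List[str] = field(default_factory=list)
    verification: List[str] = field(default_factory=list)
    remaining: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")

    def render(self) -> str:
        """Render one 'key: value' line per field (assumed layout)."""
        def join(items: List[str]) -> str:
            return "; ".join(items) if items else "none"

        return "\n".join([
            f"status: {self.status}",
            f"plan: {self.plan}",
            f"actions_taken: {join(self.actions_taken)}",
            f"validation_skills: {join(self.validation_skills)}",
            f"verification: {join(self.verification)}",
            f"remaining: {join(self.remaining)}",
        ])


if __name__ == "__main__":
    result = SkillResult(
        status="improved",
        plan="add latency metric to worker; iteration 2 of 3",
        actions_taken=["added duration_ms to worker logs"],
        verification=["ran worker locally and inspected log output"],
    )
    print(result.render())
```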
Load References
- read `references/observability.md` first
- open `references/alerting.md`, `references/best-practices.md`, `references/correlation-id.md`, `references/log-vs-metric-vs-trace.md`, or `references/pitfalls.md` only when needed
Example Requests
- "Add observability requirements for this background worker."
- "We have logs but still cannot debug failures. Fix the plan."
- "Define alerts and traces for this API flow."