Agent skill

storage-debug-instrumentation

Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.

Install this agent skill to your Project

npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/benderfendor/storage-debug-instrumentation

SKILL.md

Storage Debug Instrumentation

Purpose

Enable rapid diagnosis of storage state, synchronization health, and backend performance bottlenecks by exposing:

  • Raw article inspection from both PostgreSQL and ChromaDB
  • Storage drift detection (entries missing from ChromaDB, and entries dangling in ChromaDB with no matching PostgreSQL row)
  • Detailed startup timeline breakdown (DB init, cache preload, vector store, RSS refresh)
  • One-page debug dashboard consolidating all diagnostics

Scope

  • Backend: backend/app/services/startup_metrics.py, backend/app/main.py, backend/app/vector_store.py, backend/app/database.py, backend/app/api/routes/debug.py
  • Frontend: frontend/lib/api.ts, frontend/app/debug/page.tsx
  • No schema changes; purely additive instrumentation and debug routes

Workflow

1. Create startup metrics service

File: backend/app/services/startup_metrics.py

  • Implement thread-safe StartupMetrics class to record phase timings
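  • Expose record_event(name, started_at, detail, metadata) for phase capture
  • Expose mark_app_started() and mark_app_completed() to bracket the full startup sequence (both are called from on_startup() in step 3)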
  • Support add_note(key, value) for arbitrary annotations
  • Export singleton startup_metrics for app-wide use
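
A minimal sketch of what such a service could look like, assuming wall-clock timestamps from time.time() and a single lock guarding all state. The method names record_event, add_note, mark_app_started, and mark_app_completed come from this workflow; the snapshot() helper and the exact field names in each event are assumptions made for illustration.

```python
import threading
import time
from typing import Any, Optional


class StartupMetrics:
    """Thread-safe recorder for startup phase timings and free-form notes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events = []
        self._notes = {}
        self._started_at: Optional[float] = None
        self._completed_at: Optional[float] = None

    def mark_app_started(self) -> None:
        with self._lock:
            self._started_at = time.time()
            self._completed_at = None

    def mark_app_completed(self) -> None:
        with self._lock:
            self._completed_at = time.time()

    def record_event(self, name: str, started_at: float,
                     detail: Optional[str] = None,
                     metadata: Optional[dict] = None) -> None:
        # Duration runs from the caller-supplied start timestamp to now.
        finished_at = time.time()
        event = {
            "name": name,
            "started_at": started_at,
            "duration_seconds": max(0.0, finished_at - started_at),
            "detail": detail,
            "metadata": dict(metadata or {}),
        }
        with self._lock:
            self._events.append(event)

    def add_note(self, key: str, value: Any) -> None:
        with self._lock:
            self._notes[key] = value

    def snapshot(self) -> dict:
        """Return a copy of the recorded state (hypothetical helper)."""
        with self._lock:
            total = None
            if self._started_at is not None and self._completed_at is not None:
                total = self._completed_at - self._started_at
            return {
                "total_seconds": total,
                "events": [dict(event) for event in self._events],
                "notes": dict(self._notes),
            }


# Module-level singleton, imported wherever a phase needs timing.
startup_metrics = StartupMetrics()
```

Returning copies from snapshot() keeps callers (for example a debug route) from mutating the recorder's internal lists while another thread is still appending events.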

2. Instrument vector store initialization

File: backend/app/vector_store.py

  • Import startup_metrics
  • In VectorStore.__init__(), wrap initialization with time.time() timer
  • Record event with metadata: host, port, collection, documents
  • Catch connection errors and annotate them
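
The constructor instrumentation described above might be sketched as follows. To keep the example runnable without a Chroma server, the connection is made through a hypothetical connect callable and the metrics recorder is passed in as a parameter; in the real module both would more likely be the Chroma client call and the imported startup_metrics singleton. The event name, note key, and detail strings are assumptions.

```python
import time
from typing import Any, Callable


class VectorStore:
    """Sketch of an instrumented constructor; the real class wraps a Chroma client."""

    def __init__(self, host: str, port: int, collection: str,
                 connect: Callable[[str, int, str], Any], metrics: Any) -> None:
        started_at = time.time()
        self.collection = None
        metadata = {"host": host, "port": port, "collection": collection}
        try:
            # `connect` is a hypothetical factory returning an object with count().
            self.collection = connect(host, port, collection)
            metadata["documents"] = self.collection.count()
            detail = "connected"
        except Exception as exc:  # broad on purpose so startup can continue
            detail = "connection failed"
            metrics.add_note("vector_store_error", f"{type(exc).__name__}: {exc}")
        # The event is recorded on both paths so failures still appear in the timeline.
        metrics.record_event("vector_store_init", started_at, detail, metadata)
```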

3. Instrument FastAPI startup sequence

File: backend/app/main.py

  • Call startup_metrics.mark_app_started() at beginning of on_startup()
  • Wrap each phase (DB init, schedulers, cache preload, RSS refresh, migration) with record_event()
  • Include metadata: cache_size, article_count, oldest_article_hours
  • Call startup_metrics.mark_app_completed() at end
  • Add app version notes via add_note()
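
One way to keep the per-phase wrapping readable is a small context manager, sketched below. It is framework-agnostic so it runs without FastAPI installed; timed_phase, the phase names, and the init_db / preload_cache callables are hypothetical, and only two of the phases are shown. The metrics argument is assumed to offer the methods named in steps 1 and 3.

```python
import asyncio
import time
from contextlib import contextmanager


@contextmanager
def timed_phase(metrics, name):
    """Time one startup phase; the body fills `metadata` before the block exits."""
    started_at = time.time()
    metadata = {}
    detail = "ok"
    try:
        yield metadata
    except Exception as exc:
        detail = f"failed: {exc}"
        raise
    finally:
        # Recorded on success and on failure, so a broken phase is still visible.
        metrics.record_event(name, started_at, detail, metadata)


async def on_startup(metrics, init_db, preload_cache, app_version):
    # init_db and preload_cache are hypothetical async callables for two phases.
    metrics.mark_app_started()
    metrics.add_note("app_version", app_version)
    with timed_phase(metrics, "db_init") as meta:
        meta["article_count"] = await init_db()
    with timed_phase(metrics, "cache_preload") as meta:
        meta["cache_size"] = await preload_cache()
    metrics.mark_app_completed()
```

In this sketch a failing phase re-raises, so mark_app_completed() is not reached; whether startup should instead continue past a failed phase is a choice for the real on_startup().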

4. Add database pagination helpers

File: backend/app/database.py

  • Implement fetch_articles_page() to support:
    • Limit/offset pagination
    • Optional source filter
    • Missing-embeddings-only flag
    • Published date range filters
    • Sort direction (asc/desc)
    • Return oldest/newest timestamp bounds
  • Implement fetch_article_chroma_mappings() to return all article→chroma ID mappings for drift analysis
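
The filter and pagination logic could be sketched as below. This is an illustration only: it uses the standard-library sqlite3 module as a stand-in so it runs without PostgreSQL (a Postgres driver such as psycopg would use %s placeholders instead of ?), and it assumes a hypothetical articles table with id, source, title, published_at, and chroma_id columns, where a NULL chroma_id means the embedding is missing. Parameter names beyond those implied by the list above are assumptions.

```python
import sqlite3
from typing import Any, Optional


def fetch_articles_page(conn, limit: int = 50, offset: int = 0,
                        source: Optional[str] = None,
                        missing_embeddings_only: bool = False,
                        published_after: Optional[str] = None,
                        published_before: Optional[str] = None,
                        sort_direction: str = "desc") -> dict:
    clauses, params = [], []
    if source is not None:
        clauses.append("source = ?")
        params.append(source)
    if missing_embeddings_only:
        clauses.append("chroma_id IS NULL")
    if published_after is not None:
        clauses.append("published_at >= ?")
        params.append(published_after)
    if published_before is not None:
        clauses.append("published_at <= ?")
        params.append(published_before)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    # Only one of two fixed keywords is ever interpolated into the SQL text.
    order = "ASC" if sort_direction.lower() == "asc" else "DESC"

    # Totals and timestamp bounds cover the whole filtered set, not just this page.
    total, oldest, newest = conn.execute(
        "SELECT COUNT(*), MIN(published_at), MAX(published_at) FROM articles" + where,
        params,
    ).fetchone()
    rows = conn.execute(
        "SELECT id, source, title, published_at, chroma_id FROM articles" + where
        + f" ORDER BY published_at {order} LIMIT ? OFFSET ?",
        [*params, limit, offset],
    ).fetchall()
    return {"total": total, "oldest": oldest, "newest": newest, "rows": rows}


def fetch_article_chroma_mappings(conn) -> list:
    """Return (article_id, chroma_id) pairs for every article that has a mapping."""
    return conn.execute(
        "SELECT id, chroma_id FROM articles WHERE chroma_id IS NOT NULL ORDER BY id"
    ).fetchall()
```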

5. Add vector store pagination helpers

File: backend/app/vector_store.py

  • Implement list_articles(limit, offset) to return paginated Chroma documents with metadata and previews
  • Implement list_all_ids() to return all stored Chroma IDs for drift detection (used by /debug/storage/drift)
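
A sketch of the two helpers, written as free functions over a collection object so they can be exercised with a fake. The collection is assumed to behave like a Chroma collection whose get() accepts limit, offset, and include and returns a dict with "ids", "documents", and "metadatas"; passing an empty include list to fetch IDs only is an assumption worth checking against the installed chromadb version. The preview_chars and batch_size parameters are hypothetical.

```python
from typing import Any


def list_articles(collection: Any, limit: int = 50, offset: int = 0,
                  preview_chars: int = 200) -> list:
    """Return one page of documents with metadata and a truncated preview."""
    result = collection.get(limit=limit, offset=offset,
                            include=["documents", "metadatas"])
    ids = result["ids"]
    documents = result.get("documents") or [None] * len(ids)
    metadatas = result.get("metadatas") or [None] * len(ids)
    return [
        {"id": doc_id, "metadata": meta or {}, "preview": (doc or "")[:preview_chars]}
        for doc_id, doc, meta in zip(ids, documents, metadatas)
    ]


def list_all_ids(collection: Any, batch_size: int = 1000) -> list:
    """Collect every ID in batches so a large collection is not loaded at once."""
    all_ids = []
    offset = 0
    while True:
        batch = collection.get(limit=batch_size, offset=offset, include=[])["ids"]
        all_ids.extend(batch)
        # A short (or empty) batch means the end of the collection was reached.
        if len(batch) < batch_size:
            return all_ids
        offset += batch_size
```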

6. Expose debug API endpoints

File: backend/app/api/routes/debug.py

  • Add GET /debug/startup → returns startup metrics timeline (events + notes)
  • Add GET /debug/chromadb/articles → returns paginated raw Chroma entries with limit/offset
  • Add GET /debug/database/articles → returns paginated Postgres rows with filters (source, embeddings, date range, sort)
  • Add GET /debug/storage/drift → compares Chroma IDs vs Postgres mappings, returns missing/dangling counts + samples
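
The comparison behind /debug/storage/drift reduces to two set differences. The sketch below assumes the inputs are the (article_id, chroma_id) pairs from fetch_article_chroma_mappings() and the ID list from list_all_ids(); the function name compute_storage_drift and the response field names are hypothetical.

```python
from typing import Iterable, Tuple


def compute_storage_drift(db_mappings: Iterable[Tuple[int, str]],
                          chroma_ids: Iterable[str],
                          sample_size: int = 20) -> dict:
    """Compare Postgres article-to-Chroma ID mappings with the IDs Chroma holds."""
    expected = {chroma_id: article_id for article_id, chroma_id in db_mappings}
    stored = set(chroma_ids)
    # Missing: Postgres points at a Chroma ID that the vector store lacks.
    missing = sorted(cid for cid in expected if cid not in stored)
    # Dangling: Chroma holds an ID that no Postgres row references.
    dangling = sorted(stored.difference(expected))
    return {
        "database_count": len(expected),
        "chroma_count": len(stored),
        "missing_in_chroma_count": len(missing),
        "dangling_in_chroma_count": len(dangling),
        "missing_in_chroma_sample": [
            {"article_id": expected[cid], "chroma_id": cid}
            for cid in missing[:sample_size]
        ],
        "dangling_in_chroma_sample": dangling[:sample_size],
    }
```

Capping the samples keeps the response small even when drift is large, while the counts still report the full extent.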

7. Add frontend API bindings

File: frontend/lib/api.ts

  • Export types: StartupEventMetric, StartupMetricsResponse, ChromaDebugResponse, DatabaseDebugResponse, StorageDriftReport
  • Export fetchers: fetchStartupMetrics(), fetchChromaDebugArticles(), fetchDatabaseDebugArticles(), fetchStorageDrift()
  • Ensure snake_case→camelCase mapping for response fields

8. Build debug dashboard page

File: frontend/app/debug/page.tsx

  • Create /debug route with multi-tab inspection UI
  • Render startup timeline: phase name, duration, metadata badges (cache size, vectors, migrated records)
  • Display Chroma browser: paginated table with ID, title, source, preview
  • Display Postgres browser: paginated table with filters (source, date range, missing-embeddings-only flag)
  • Display drift report: sample tables for missing-in-chroma and dangling-in-chroma entries
  • Include summary cards for quick metrics (boot time, total articles, vector count, drift count)

Implementation checklist

  • Create backend/app/services/startup_metrics.py
  • Instrument backend/app/vector_store.py::VectorStore.__init__()
  • Instrument backend/app/main.py::on_startup() (all phases)
  • Add fetch_articles_page() and fetch_article_chroma_mappings() to backend/app/database.py
  • Add list_articles() and list_all_ids() to backend/app/vector_store.py
  • Add /debug/startup, /debug/chromadb/articles, /debug/database/articles, /debug/storage/drift to backend/app/api/routes/debug.py
  • Add types and fetchers to frontend/lib/api.ts
  • Create frontend/app/debug/page.tsx with dashboard layout
  • Run uvx ruff check backend → all checks pass
  • Test endpoints with curl or Postman to verify response structure

Verification checklist

  • GET http://localhost:8000/debug/startup returns valid timeline with events and notes
  • GET http://localhost:8000/debug/chromadb/articles?limit=50&offset=0 returns paginated Chroma docs
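  • GET http://localhost:8000/debug/database/articles?source=bbc&missing_embeddings_only=false returns only rows whose source is bbc, whether or not they have embeddings
  • GET http://localhost:8000/debug/storage/drift compares Chroma IDs against Postgres mappings and returns missing/dangling counts with sample entries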
  • http://localhost:3000/debug loads without errors and displays all four sections
  • Refresh button triggers all four API calls in parallel
  • Pagination controls update limit/offset correctly
  • Database filters (source, date range) update and refresh data
  • Startup timeline shows non-zero phase durations when the backend has just started
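
The backend items in this checklist can be partly automated with a short smoke script, sketched here with only the standard library. The paths are the ones listed above; the required keys for /debug/startup follow the "events and notes" wording, while the other endpoints are only checked for a successful JSON response because their field names are not specified here. The function names and the injectable fetch parameter are hypothetical.

```python
import json
from typing import Callable
from urllib.request import urlopen

# Paths come from the checklist; required top-level keys are assumptions.
CHECKS = {
    "/debug/startup": ("events", "notes"),
    "/debug/chromadb/articles?limit=50&offset=0": (),
    "/debug/database/articles?source=bbc&missing_embeddings_only=false": (),
    "/debug/storage/drift": (),
}


def http_get_json(url: str) -> dict:
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def run_smoke_checks(base_url: str,
                     fetch: Callable[[str], dict] = http_get_json) -> dict:
    """Return {path: problem or None} for each debug endpoint."""
    results = {}
    for path, required_keys in CHECKS.items():
        try:
            payload = fetch(base_url + path)
        except Exception as exc:
            results[path] = f"request failed: {exc}"
            continue
        absent = [key for key in required_keys if key not in payload]
        results[path] = f"missing keys: {absent}" if absent else None
    return results
```

Calling run_smoke_checks("http://localhost:8000") against a running backend yields None for every path that responded with the expected structure.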

Future enhancements

  • Streaming startup metrics via SSE (live tail during boot)
  • Export startup report as JSON/CSV for performance tracking over time
  • Automated drift alerts (post to Slack/email if dangling > threshold)
  • Performance graphs (startup time trends, article throughput)
  • Sync-on-demand action (button to force vector store refresh for missing articles)
