Agent skill
backend-hang-debug
Diagnose and fix FastAPI hangs caused by blocking ThreadPoolExecutor shutdown in the news stream route; includes py-spy capture and non-blocking executor pattern.
Install this agent skill to your Project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/benderfendor/backend-hang-debug
Backend Hang Debug
Purpose
- Detect and resolve event-loop hangs where the FastAPI app stops responding (e.g., `curl http://localhost:8000/` times out) due to synchronous executor shutdown in the SSE news stream.
- Provide a repeatable triage flow using `py-spy` to capture live stacks and pinpoint blocking code.
Scope
- Backend: `backend/app/api/routes/stream.py` (news stream), `backend/app/services/rss_ingestion.py` (RSS workers), startup processes.
- Tooling: `py-spy` for live stack dumps; `curl` with timeouts for smoke tests.
Quick Triage
- Reproduce hang: `curl -m 5 http://localhost:8000/` and `curl -m 5 http://localhost:8000/health`; note timeouts.
- Process check: `ss -tlnp | grep 8000` to confirm listener; `ls /proc/$(pgrep -f "uvicorn app.main")/fd | wc -l` to rule out FD leak.
- Stack capture (inside backend venv): `uv pip install py-spy` then `sudo /home/bender/classwork/Thesis/backend/.venv/bin/py-spy dump --pid $(pgrep -f "uvicorn app.main")` (and worker pid if multiprocess). Look for `ThreadPoolExecutor.shutdown` in `api/routes/stream.py` frames.
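The reproduce step can also be scripted. The following is a minimal sketch using only the standard library; the helper name `probe` and its return strings are illustrative and not part of the project. It treats a read timeout the same way `curl -m 5` does, which is how a blocked event loop usually shows up (the socket accepts the connection but no response arrives).

```python
import socket
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> str:
    """Classify one HTTP GET as ok, http-error, or unreachable (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"ok:{resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a 2xx/3xx status.
        return f"http-error:{exc.code}"
    except (urllib.error.URLError, socket.timeout, TimeoutError, ConnectionError) as exc:
        # Connection refused, or accepted but no response within `timeout`.
        return f"unreachable:{type(exc).__name__}"
```

For example, `probe("http://localhost:8000/health")` returning an `unreachable:` result while `ss` still shows a listener suggests the process is alive but not serving, which is the cue to capture stacks with `py-spy`.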
Fix Pattern (non-blocking executor)
- Replace synchronous context manager `with ThreadPoolExecutor(...):` inside `event_generator` with a long-lived executor plus explicit non-blocking shutdown:
  - Create executor outside the context manager.
  - On client disconnect, cancel pending futures instead of awaiting shutdown.
  - In `finally`, call `executor.shutdown(wait=False, cancel_futures=True)`.
- Rationale: context manager calls `shutdown(wait=True)`, blocking the event loop if RSS worker threads hang on network I/O.
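The difference between the two shutdown modes can be observed in isolation. This is a small sketch, not the project's code: `slow_fetch` is a hypothetical stand-in for an RSS worker stuck on network I/O, and the two timing helpers are illustrative. Note that `cancel_futures` requires Python 3.9 or later.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch(seconds: float) -> str:
    """Hypothetical stand-in for a worker blocked on network I/O."""
    time.sleep(seconds)
    return "done"


def exit_via_context_manager(seconds: float) -> float:
    """Return how long it takes to leave a `with` block while a worker is busy."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=1) as executor:
        executor.submit(slow_fetch, seconds)
    # Leaving the block called shutdown(wait=True), so the caller waited here.
    return time.monotonic() - start


def exit_non_blocking(seconds: float) -> float:
    """Return how long an explicit non-blocking shutdown takes."""
    start = time.monotonic()
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        executor.submit(slow_fetch, seconds)
    finally:
        # Returns immediately; queued work is cancelled, running work is not interrupted.
        executor.shutdown(wait=False, cancel_futures=True)
    return time.monotonic() - start
```

The first helper takes roughly as long as the worker sleeps, while the second returns almost immediately. A task that is already running is not interrupted by `cancel_futures=True`; its thread keeps going until the call returns, which is why worker-side network timeouts still matter.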
Implementation Steps
- Update stream executor usage in `backend/app/api/routes/stream.py`:
  - Instantiate `executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)`.
  - Dispatch work via `loop.run_in_executor(executor, _process_source_with_debug, ...)`.
  - On disconnect, `cancel()` pending futures.
  - In `finally`, `executor.shutdown(wait=False, cancel_futures=True)`.
- Keep RSS executor as-is (`rss_ingestion.py`) since it runs in background threads, but ensure request timeouts remain reasonable (currently 60s per RSS `requests.get`).
- Retest:
  - Restart uvicorn; `curl -m 5 http://localhost:8000/health` should respond.
  - Start a stream request and abort the client; server must stay responsive.
  - Re-run `py-spy dump` to verify no `ThreadPoolExecutor.shutdown(wait=True)` frames in main thread.
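The executor steps above could look roughly like the sketch below. It is not the actual contents of `stream.py`: the body of `_process_source_with_debug` is a placeholder, the `sources` and `is_disconnected` parameters are assumptions made so the example runs on its own, and the 0.5-second poll interval is arbitrary. In a real route, `is_disconnected` would presumably be Starlette's `request.is_disconnected`.

```python
import asyncio
import concurrent.futures
import time
from typing import AsyncIterator, Awaitable, Callable, Iterable


def _process_source_with_debug(source: str) -> str:
    """Placeholder for the blocking per-source fetch (real body not shown in the skill)."""
    time.sleep(0.05)
    return f"fetched:{source}"


async def event_generator(
    sources: Iterable[str],
    is_disconnected: Callable[[], Awaitable[bool]],
) -> AsyncIterator[str]:
    loop = asyncio.get_running_loop()
    # Long-lived executor created outside any `with` block.
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)
    pending = {
        loop.run_in_executor(executor, _process_source_with_debug, src)
        for src in sources
    }
    try:
        while pending:
            if await is_disconnected():
                break  # client went away; stop streaming
            # Wake up periodically so a disconnect is noticed even if workers stall.
            done, pending = await asyncio.wait(
                pending, timeout=0.5, return_when=asyncio.FIRST_COMPLETED
            )
            for fut in done:
                yield f"data: {fut.result()}\n\n"
    finally:
        for fut in pending:
            fut.cancel()  # drop work that has not started yet
        # Never block the event loop waiting for worker threads.
        executor.shutdown(wait=False, cancel_futures=True)
```

Because `fut.result()` re-raises worker exceptions, a production version would likely wrap it in error handling so one failing source does not end the stream.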
Verification Checklist
- `curl -m 5 http://localhost:8000/` returns a response (no hang).
- `curl -m 5 http://localhost:8000/health` succeeds.
- Aborting `/news/stream` does not freeze subsequent requests.
- `py-spy dump` shows event loop not blocked on `ThreadPoolExecutor.shutdown`.
- Frontend no longer stalls waiting on root/health while backend is busy with streams.
Notes & Future Hardening
- Consider adding request timeout middleware to fail fast on slow handlers.
- Add per-source network timeouts and shorter retries for RSS feeds to reduce long-lived threads.
- If multi-worker uvicorn is used, run `py-spy` on each worker pid when diagnosing hangs.
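One possible shape for the request timeout middleware mentioned above is a pure-ASGI wrapper. This is a sketch under assumptions: the class name `TimeoutMiddleware`, the 504 status, and the choice to stop enforcing the deadline once response headers have been sent (so long-lived SSE streams are not cut off) are all illustrative choices, not something the project prescribes.

```python
import asyncio
import contextlib


class TimeoutMiddleware:
    """Answer 504 if the wrapped app has not started a response within `timeout` seconds."""

    def __init__(self, app, timeout: float = 10.0):
        self.app = app
        self.timeout = timeout

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        started = asyncio.Event()

        async def tracking_send(message):
            if message["type"] == "http.response.start":
                started.set()
            await send(message)

        task = asyncio.ensure_future(self.app(scope, receive, tracking_send))
        waiter = asyncio.ensure_future(started.wait())
        # Wait until the handler finishes, starts responding, or the deadline passes.
        await asyncio.wait(
            {task, waiter}, timeout=self.timeout, return_when=asyncio.FIRST_COMPLETED
        )
        waiter.cancel()

        if not started.is_set() and not task.done():
            task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await task
            await send({
                "type": "http.response.start",
                "status": 504,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"handler timed out"})
            return

        # Response already started (or handler finished): let it run to completion.
        await task
```

With FastAPI this could be registered via `app.add_middleware(TimeoutMiddleware, timeout=10.0)`. A middleware like this only helps with slow `await`-based handlers; it cannot fire while the event loop itself is blocked by a synchronous call such as `shutdown(wait=True)`, so it complements the executor fix rather than replacing it.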