Agent skill
azure-troubleshoot
Troubleshoot Azure using tool-first access, falling back to Azure CLI when necessary. Focus on Virtual Machines, AKS, Azure Container Registry, Storage Accounts, and Log Analytics.
Install this agent skill to your Project
npx add-skill https://github.com/timbuchinger/loadout/tree/main/skills/azure-troubleshoot
SKILL.md
Azure Troubleshooting Skill
General Guidance
Always use tool-based queries first to fetch logs, metrics, and diagnostic data.
Only fall back to Azure CLI for deeper or unsupported inspection.
Investigations should:
- Use Log Analytics Kusto queries with proper scoping
- Use Activity Logs to identify failures
- Use metrics when diagnosing performance issues
- Provide minimal, targeted remediation advice
Core Services Covered
Virtual Machines
Common issues:
- Boot failures
- OS/disk failures
- NIC/IP misconfiguration
Investigations:
- Inspect boot diagnostics logs
- Query the Heartbeat table for VM status
- Check Activity Logs for failed start/stop operations
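A minimal sketch of the Heartbeat check, written as a Python helper that only builds the KQL text. The function name heartbeat_query, the lookback parameter, and the sample resource ID are hypothetical; the sketch assumes the VM reports to a Log Analytics workspace and that the standard Heartbeat columns (TimeGenerated, Computer, _ResourceId) are present.

```python
def heartbeat_query(resource_id: str, lookback: str = "1h") -> str:
    """Build a KQL query that returns the most recent heartbeat for one VM."""
    # Escape backslashes and double quotes so the ID stays inside the KQL string literal.
    safe_id = resource_id.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "Heartbeat",
        f"| where TimeGenerated > ago({lookback})",
        # =~ compares case-insensitively, which helps because resource IDs
        # are often stored in lower case.
        f'| where _ResourceId =~ "{safe_id}"',
        "| summarize LastHeartbeat = max(TimeGenerated) by Computer",
    ])


if __name__ == "__main__":
    # Hypothetical resource ID, for illustration only.
    vm_id = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/example-rg/providers/Microsoft.Compute"
        "/virtualMachines/example-vm"
    )
    print(heartbeat_query(vm_id, lookback="30m"))
```

An empty result for a short lookback window suggests the VM or its agent has stopped reporting, which is a cue to move on to boot diagnostics and the Activity Log.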
AKS
Common issues:
- Pod scheduling failures
- Node pressure
- Image pull errors (ACR auth)
- Container crashes
Investigations:
- Query KubeEvents
- Query KubePodInventory
- Inspect ContainerLog
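As a starting point for the KubeEvents step, the sketch below builds a KQL query that counts warning events by reason and namespace. The helper name kube_warning_summary_query and its parameters are hypothetical, and the column names (KubeEventType, Reason, Namespace) assume the Container insights schema is in use on the cluster.

```python
def kube_warning_summary_query(cluster_resource_id: str, lookback: str = "1h") -> str:
    """Build a KQL query that counts Warning events per reason and namespace."""
    lines = [
        "KubeEvents",
        f"| where TimeGenerated > ago({lookback})",
        # Scope to a single cluster; =~ is a case-insensitive comparison.
        f'| where _ResourceId =~ "{cluster_resource_id}"',
        # Assumption: Kubernetes event types are recorded as "Normal" or "Warning".
        '| where KubeEventType == "Warning"',
        "| summarize Events = count() by Reason, Namespace",
        "| order by Events desc",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical cluster resource ID, for illustration only.
    print(kube_warning_summary_query("/subscriptions/<sub-id>/.../managedClusters/example-aks"))
```

The most frequent reasons in the result usually indicate which of the issue types listed above to chase first, after which KubePodInventory and ContainerLog can be queried for the affected pods.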
Azure Container Registry (ACR)
Common issues:
- Permission denied (RBAC)
- Token expiration
Investigations:
- Query Activity Logs for push/write or pull/read failures
- Check repository event logs
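The Activity Log check could be expressed as KQL along the following lines. This is a sketch under assumptions: it presumes the Activity Log is exported to a Log Analytics workspace as the AzureActivity table, that failed operations carry the status value "Failure", and that push and pull operations are actually recorded there in your environment. The helper name and the ACR_OPERATIONS tuple are hypothetical.

```python
# Operation name fragments taken from the investigation step above.
ACR_OPERATIONS = ("push/write", "pull/read")


def acr_activity_failures_query(registry_resource_id: str, lookback: str = "24h") -> str:
    """Build a KQL query for failed push/pull operations on one registry."""
    # contains is case-insensitive, so it also matches upper-cased operation names.
    op_filter = " or ".join(
        f'OperationNameValue contains "{op}"' for op in ACR_OPERATIONS
    )
    return "\n".join([
        "AzureActivity",
        f"| where TimeGenerated > ago({lookback})",
        f'| where _ResourceId =~ "{registry_resource_id}"',
        f"| where {op_filter}",
        # Assumption: failed operations are recorded with the status "Failure".
        '| where ActivityStatusValue =~ "Failure"',
        "| project TimeGenerated, Caller, OperationNameValue, ActivityStatusValue",
        "| order by TimeGenerated desc",
    ])


if __name__ == "__main__":
    # Hypothetical registry resource ID, for illustration only.
    print(acr_activity_failures_query("/subscriptions/<sub-id>/.../registries/exampleacr"))
```

The Caller column helps separate RBAC problems (a consistent caller failing every time) from token expiration (a caller that succeeded earlier and then starts failing).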
Storage Accounts
Common issues:
- Firewall-restricted access
- SAS token expiration
- Object not found
Investigations:
- Query StorageBlobLogs
- Validate configuration + permissions
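One way to connect the common issues to StorageBlobLogs is to filter on HTTP status codes: firewall blocks and expired SAS tokens typically surface as 403, and missing objects as 404. The sketch below builds such a query. The helper name, the issue labels in ISSUE_STATUS_CODES, and the parameters are hypothetical, and it assumes diagnostic settings already send blob resource logs to a workspace.

```python
# Hypothetical labels mapping the issue types above to HTTP status codes.
ISSUE_STATUS_CODES = {
    "access_denied": "403",  # firewall restrictions, expired or invalid SAS tokens
    "not_found": "404",      # missing blob or container
}


def storage_blob_errors_query(
    account_resource_id: str,
    issues: tuple = ("access_denied", "not_found"),
    lookback: str = "1h",
) -> str:
    """Build a KQL query listing recent blob requests that failed with the given issues."""
    codes = ", ".join(f'"{ISSUE_STATUS_CODES[issue]}"' for issue in issues)
    return "\n".join([
        "StorageBlobLogs",
        f"| where TimeGenerated > ago({lookback})",
        # startswith (case-insensitive) so a sub-resource suffix on the ID,
        # if one is present, still matches the storage account.
        f'| where _ResourceId startswith "{account_resource_id}"',
        # tostring() keeps the comparison valid whether the column is text or numeric.
        f"| where tostring(StatusCode) in ({codes})",
        "| project TimeGenerated, OperationName, StatusCode, StatusText, "
        "AuthenticationType, CallerIpAddress, Uri",
        "| order by TimeGenerated desc",
    ])


if __name__ == "__main__":
    # Hypothetical storage account resource ID, for illustration only.
    print(storage_blob_errors_query("/subscriptions/<sub-id>/.../storageAccounts/examplesa"))
```

AuthenticationType and CallerIpAddress in the output help tell a SAS problem apart from a firewall rule that excludes the caller's address.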
Log Analytics
Best practices:
- Always filter by _ResourceId
- Narrow time range
- Query only the tables relevant to the service
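The three practices above can be combined in one small helper: it picks only the tables this skill names for a service, requires a bounded time range, and always adds a _ResourceId filter. This is an illustrative sketch; SERVICE_TABLES, scoped_queries, and the lookback format check are hypothetical names and choices, not part of any Azure SDK.

```python
import re

# Tables this skill associates with each service (taken from the sections above).
SERVICE_TABLES = {
    "vm": ["Heartbeat"],
    "aks": ["KubeEvents", "KubePodInventory", "ContainerLog"],
    "storage": ["StorageBlobLogs"],
}

# Accept simple KQL timespan literals such as 30m, 1h, or 7d.
_TIMESPAN = re.compile(r"\d+[smhd]")


def scoped_queries(service: str, resource_id: str, lookback: str = "1h", limit: int = 100) -> dict:
    """Return one scoped KQL query per table relevant to the given service."""
    if service not in SERVICE_TABLES:
        raise ValueError(f"unknown service: {service!r}")
    if not _TIMESPAN.fullmatch(lookback):
        raise ValueError(f"lookback must look like '30m' or '1h', got {lookback!r}")
    return {
        table: "\n".join([
            table,
            f"| where TimeGenerated > ago({lookback})",
            # startswith is case-insensitive and tolerates sub-resource suffixes.
            f'| where _ResourceId startswith "{resource_id}"',
            f"| take {limit}",
        ])
        for table in SERVICE_TABLES[service]
    }


if __name__ == "__main__":
    # Hypothetical resource ID, for illustration only.
    for name, query in scoped_queries("aks", "/subscriptions/<sub-id>/...", "2h").items():
        print(f"-- {name}\n{query}\n")
```

Rejecting an unbounded or malformed time range up front keeps every generated query narrow by construction.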
Workflow
- Identify target service
- Query Log Analytics with scoped KQL
- Query Activity Logs
- Review metrics
- Interpret patterns
- Recommend targeted fixes