Agent skill
azure-troubleshoot
Troubleshoot Azure using tool-first access, falling back to Azure CLI when necessary. Focus on Virtual Machines, AKS, Azure Container Registry, Storage Accounts, and Log Analytics.
Stars
163
Forks
31
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/azure-troubleshoot
SKILL.md
Azure Troubleshooting Skill
General Guidance
Always use tool-based queries first to fetch logs, metrics, and diagnostic data.
Only fall back to Azure CLI for deeper or unsupported inspection.
Investigations should:
- Use Log Analytics Kusto queries with proper scoping
- Use Activity Logs to identify failures
- Use metrics when diagnosing performance issues
- Provide minimal, targeted remediation advice
Core Services Covered
Virtual Machines
Common issues:
- Boot failures
- OS/disk failures
- NIC/IP misconfiguration
Investigations:
- Inspect boot diagnostics logs
- Query
Heartbeattable for VM status - Check Activity Logs for failed start/stop operations
AKSS
Common issues:
- Pod scheduling failures
- Node pressure
- Image pull errors (ACR auth)
- Container crashes
Investigations:
- Query
KubeEvents - Query
KubePodInventory - Inspect
ContainerLog
Azure Container Registry (ACR))
Common issues:
- Permission denied (RBAC)
- Token expiration
Investigations:
- Query Activity Logs for
push/writeorpull/readfailures - Check repository event logs
Storage Accountss
Common issues:
- Firewall-restricted access
- SAS token expiration
- Object not found
Investigations:
- Query
StorageBlobLogs - Validate configuration + permissions
Log Analytics
Best practices:
- Always filter by
_ResourceId - Narrow time range
- Query only the tables relevant to the service
Workflow
- Identify target service
- Query Log Analytics with scoped KQL
- Query Activity Logs
- Review metrics
- Interpret patterns
- Recommend targeted fixes
Didn't find tool you were looking for?