k8s-troubleshoot

Debug Kubernetes pods, services, and cluster issues. Use when the user says "pod not starting", "CrashLoopBackOff", "service not reachable", "kubectl debug", "pod stuck pending", or asks about Kubernetes problems.

Install this agent skill in your project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/devops/k8s-troubleshoot-mhalder-dotfiles

SKILL.md

Kubernetes Troubleshoot

Debug pods, services, deployments, and networking issues in Kubernetes.

Instructions

1. Identify the affected resource (pod, service, deployment).
2. Get the current state with kubectl get and kubectl describe.
3. Check logs if applicable.
4. Diagnose based on status and events.
5. Provide specific remediation steps.
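
The first three steps can be gathered in one pass. The following is a minimal sketch, not a definitive tool: `triage_pod` is a hypothetical helper name, it assumes `kubectl` is already pointed at the right cluster, and it only runs read-only commands.

```shell
# triage_pod POD [NAMESPACE]
# Hypothetical helper: prints state, events, and logs for one pod.
# Every command here is read-only.
triage_pod() {
  pod="$1"
  ns="${2:-default}"

  echo "== State =="
  kubectl get pod "$pod" -n "$ns" -o wide

  echo "== Describe (conditions and events) =="
  kubectl describe pod "$pod" -n "$ns"

  echo "== Current logs (last 50 lines per container) =="
  kubectl logs "$pod" -n "$ns" --all-containers --tail=50

  echo "== Logs from the previous container instance =="
  # --previous fails when no container has restarted; treat that as informational.
  kubectl logs "$pod" -n "$ns" --all-containers --previous --tail=50 2>/dev/null ||
    echo "(no previous container instance)"
}
```

Calling `triage_pod <pod> <namespace>` prints everything needed for steps 4 and 5 without changing the cluster.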

Diagnostic commands

bash

# Pod debugging
kubectl get pods -o wide
kubectl describe pod <pod>
kubectl logs <pod> [--previous] [-c <container>]
kubectl get events --sort-by=.lastTimestamp

# Service/networking
kubectl get svc,endpoints
kubectl describe svc <service>
kubectl get ingress

# Resource issues (kubectl top requires metrics-server in the cluster)
kubectl top pods
kubectl describe node <node> | grep -A5 "Allocated resources"

# Debug pod (ephemeral container)
kubectl debug -it <pod> --image=busybox --target=<container>

Common issues

Status	Cause	Solution
Pending	Insufficient node resources	Check node capacity and pod resource requests
Pending	No node matches scheduling constraints	Check nodeSelector, affinity, taints/tolerations
ImagePullBackOff	Bad image name or registry auth	Verify image name and tag, imagePullSecrets
CrashLoopBackOff	App crashing on start	Check logs (--previous), entrypoint, health probes
CreateContainerConfigError	Missing or invalid ConfigMap/Secret	Verify referenced ConfigMaps, Secrets, and keys exist
Evicted	Node pressure (memory, disk)	Check node conditions, resource requests and limits
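
The table can also be read as a lookup from status to first check. Here is a small sketch of that lookup, assuming the status string is taken from the STATUS column of `kubectl get pods`; `next_check` is a hypothetical name and the hints only paraphrase the rows above.

```shell
# next_check STATUS
# Hypothetical helper: prints the first thing to look at for a pod status.
# The mapping mirrors the Common issues table; unknown statuses fall through.
next_check() {
  case "$1" in
    Pending)
      echo "describe pod: check scheduling events, node capacity, nodeSelector, taints" ;;
    ImagePullBackOff)
      echo "verify image name and imagePullSecrets" ;;
    CrashLoopBackOff)
      echo "check logs --previous, then entrypoint and health probes" ;;
    CreateContainerConfigError)
      echo "verify referenced ConfigMaps and Secrets exist" ;;
    Evicted)
      echo "describe node: check node conditions and resource limits" ;;
    *)
      echo "no table entry: start with describe pod and events" ;;
  esac
}
```

For example, `next_check CrashLoopBackOff` points at logs first, which matches the rule that logs must be checked for that status.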

Service not reachable checklist

Pod running? kubectl get pods -l app=<app>
Pod ready? Check readiness probe
Endpoints exist? kubectl get endpoints <svc>
Service selector matches pod labels?
Port/targetPort correct?
NetworkPolicy blocking traffic?
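
The selector and endpoint checks in this list can be scripted. The sketch below reads the Service's selector and lists the pods it actually matches. `svc_selector` and `check_service` are hypothetical helper names, and the sketch assumes the Service uses a plain key/value selector.

```shell
# svc_selector SERVICE [NAMESPACE]
# Hypothetical helper: prints the Service selector as key=value,key=value.
svc_selector() {
  kubectl get svc "$1" -n "${2:-default}" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}' |
    sed 's/,$//'   # drop the trailing comma left by the template
}

# check_service SERVICE [NAMESPACE]
# Lists the pods matched by the Service selector, then the Service endpoints.
check_service() {
  sel="$(svc_selector "$1" "$2")"
  if [ -z "$sel" ]; then
    echo "Service $1 has no selector (or was not found)"
    return 1
  fi
  echo "Selector: $sel"
  # No pods listed here means the selector does not match any pod labels.
  kubectl get pods -n "${2:-default}" -l "$sel" -o wide
  # Pods listed above but no endpoints usually means the pods are not ready.
  kubectl get endpoints "$1" -n "${2:-default}"
}
```

If pods appear and endpoints are populated, the remaining items to check are port/targetPort and NetworkPolicy.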

Rules

MUST check events with kubectl describe before diagnosing
MUST check logs for CrashLoopBackOff
Never delete pods/resources without user approval
Never apply changes without showing the diff first (kubectl diff -f <file>)
Always specify namespace if not default: -n <namespace>
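
One way to follow the diff and approval rules is a wrapper that shows the diff and waits for confirmation. This is a sketch under assumptions rather than a prescribed workflow: `safe_apply` is a hypothetical name, and it relies on `kubectl diff` exiting with 0 for no differences, 1 for differences, and a higher code on error.

```shell
# safe_apply FILE [NAMESPACE]
# Hypothetical helper: shows the diff, then applies only after the user confirms.
safe_apply() {
  file="$1"
  ns="${2:-default}"

  rc=0
  kubectl diff -f "$file" -n "$ns" || rc=$?

  if [ "$rc" -eq 0 ]; then
    echo "No changes to apply."
    return 0
  elif [ "$rc" -gt 1 ]; then
    echo "kubectl diff failed (exit $rc); nothing applied." >&2
    return "$rc"
  fi

  # Exit code 1: differences were printed above, so ask before applying.
  printf 'Apply these changes to namespace %s? [y/N] ' "$ns"
  read -r answer || answer=""
  case "$answer" in
    y|Y) kubectl apply -f "$file" -n "$ns" ;;
    *)   echo "Aborted; nothing applied." ;;
  esac
}
```

Anything other than `y` or `Y` aborts, so the default is to leave the cluster untouched.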