Agent skill

kubernetes-troubleshoot

Troubleshoot and manage Kubernetes clusters, including resource inspection, debugging, pod logs, events, and cluster operations. Use when the user needs to diagnose issues, inspect workloads, analyze pod failures, or perform Kubernetes cluster operations.

Install this agent skill in your project

npx add-skill https://github.com/timbuchinger/loadout/tree/main/skills/kubernetes-troubleshoot

SKILL.md

Kubernetes Troubleshooting & Management Skill

What this Skill does

Use this skill when the user needs to troubleshoot or manage Kubernetes clusters. This includes operations such as:

Listing pods, deployments, namespaces, nodes
Getting pod logs
Fetching events for a resource
Inspecting workloads and resource conditions
Understanding CrashLoopBackOff, ImagePullBackOff, pending pods
Multi-cluster interactions through kubeconfig contexts
Suggesting next debugging steps

Tool Preference: MCP First, kubectl as Fallback

Preferred Method: Use the Kubernetes MCP server when available

MCP tools allow pre-approved operations for faster execution
More efficient for common read operations
Built-in safety guardrails

Fallback Method: Use kubectl commands via terminal when:

MCP server is not available or fails
MCP cannot provide the information needed
Specific kubectl features are required (port-forward, plugins, etc.)

This skill helps Claude:

Choose the appropriate tool based on availability and capabilities
Restrict queries to namespace/cluster automatically
Ask for confirmation before destructive actions
Recommend stepwise debugging strategies
Provide safe, context-efficient responses

Best Practices

1. Always scope operations

Include a namespace unless the user explicitly wants a cluster-wide view.
Include a context when the user has multiple clusters.
Prefer label selectors over listing every resource.
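
The scoping rules above can be sketched as a small shell wrapper. This is only an illustration: `kget` is a hypothetical helper name, not part of kubectl or of this skill, and it assumes `kubectl` is on the PATH.

```shell
# Hypothetical helper: refuse to run a kubectl read without a namespace.
# Usage: kget <context> <namespace> <resource> [label-selector]
# Pass an empty string as <context> to use the active kubeconfig context.
kget() {
  ctx="$1"; ns="$2"; resource="$3"; selector="${4:-}"
  if [ -z "$ns" ] || [ -z "$resource" ]; then
    echo "usage: kget <context> <namespace> <resource> [label-selector]" >&2
    return 2
  fi
  # Start with the namespaced read, then add the optional scoping flags.
  set -- get "$resource" --namespace "$ns"
  if [ -n "$ctx" ]; then
    set -- "$@" --context "$ctx"
  fi
  if [ -n "$selector" ]; then
    set -- "$@" --selector "$selector"
  fi
  kubectl "$@"
}
```

For example, `kget staging frontend pods app=web` would list only the pods labelled `app=web` in the `frontend` namespace of a context named `staging` (all three names are placeholders).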

2. Prefer read-only operations first

Recommended sequence for debugging:

List pods matching a selector
Describe pod
Fetch pod events
Retrieve logs
Inspect configmaps/secrets/environment
Only then consider restart/delete/scale
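
As a rough sketch, the describe, events, and logs steps of this sequence could be chained with kubectl as follows. `triage_pod` is a hypothetical helper name, and the `--tail=100` limit is an arbitrary choice for illustration; every command in it is read-only.

```shell
# Hypothetical helper: run the read-only triage steps for one pod, in order.
# Usage: triage_pod <namespace> <pod>
triage_pod() {
  ns="$1"; pod="$2"
  if [ -z "$ns" ] || [ -z "$pod" ]; then
    echo "usage: triage_pod <namespace> <pod>" >&2
    return 2
  fi
  # 1. Status, conditions, and container states
  kubectl describe pod "$pod" --namespace "$ns"
  # 2. Events that reference this pod
  kubectl get events --namespace "$ns" \
    --field-selector "involvedObject.name=$pod"
  # 3. Recent logs from the current container
  kubectl logs "$pod" --namespace "$ns" --tail=100
  # 4. Logs from the previous container instance, which helps with
  #    CrashLoopBackOff; this fails harmlessly if there was no restart
  kubectl logs "$pod" --namespace "$ns" --previous --tail=100 || true
}
```

Inspecting configmaps, secrets, or environment variables, and any restart, delete, or scale action, are deliberately left out so that the helper never changes cluster state.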

4. Tool Usage Guidelines

When using MCP (preferred):

Examples of safe MCP-driven operations:

List Pods: list pods --namespace=<ns> --context=<cluster>
Get Pod Logs: logs --namespace=<ns> --pod=<pod>
Get Events: get events --namespace=<ns> --field-selector=involvedObject.name=<name>
Inspect Deployment: get deployment <name> --namespace=<ns>

When using kubectl (fallback):

Always specify namespace with -n <namespace> or --namespace=<namespace>
Use --context when multiple clusters are configured
Consider using -o yaml or -o json for detailed inspection
Use kubectl explain for resource documentation
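
A minimal sketch of the kubectl fallback guidelines, combining an explicit namespace and context with `-o yaml` and `kubectl explain`. `inspect_deployment` is a hypothetical helper name, and `deployment.spec.strategy` is just one example field path.

```shell
# Hypothetical helper: dump one deployment in full and show field documentation.
# Usage: inspect_deployment <context> <namespace> <name>
inspect_deployment() {
  if [ "$#" -ne 3 ]; then
    echo "usage: inspect_deployment <context> <namespace> <name>" >&2
    return 2
  fi
  # Full object as YAML, including status conditions
  kubectl get deployment "$3" --namespace "$2" --context "$1" -o yaml
  # Built-in schema documentation for the rollout strategy field
  kubectl explain deployment.spec.strategy --context "$1"
}
```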

5. Multi-Cluster Awareness

If multiple contexts exist:

Always request or infer the correct context.
Avoid ambiguous commands that default to the wrong cluster.
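
One way to guard against running a command on the wrong cluster is to check the active context first. The sketch below assumes kubectl is the tool in use; `require_context` is a hypothetical helper name.

```shell
# Hypothetical guard: succeed only if the active kubeconfig context
# matches the one the caller expects.
# Usage: require_context <expected-context>
require_context() {
  expected="$1"
  current="$(kubectl config current-context 2>/dev/null)"
  if [ "$current" != "$expected" ]; then
    echo "active context is '${current:-<none>}', expected '$expected'" >&2
    return 1
  fi
}
```

It can be chained in front of a command, as in `require_context staging && kubectl get pods -n frontend`, where `staging` is a placeholder. `kubectl config get-contexts` lists the contexts that are available, and passing `--context` on every command avoids relying on the active context at all.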

Example User Requests → Recommended Actions

User wants	Preferred (MCP)	Fallback (kubectl)
"List pods in frontend namespace"	`list pods --namespace=frontend`	`kubectl get pods -n frontend`
"Show me events for api-server"	`get events --namespace=<ns> --field-selector=involvedObject.name=api-server`	`kubectl get events -n <ns> --field-selector involvedObject.name=api-server`
"Get logs for db-0"	`logs --namespace=<ns> --pod=db-0`	`kubectl logs -n <ns> db-0`
"Why is pod web-123 CrashLooping?"	List pod → describe → events → logs (stepwise)	`kubectl describe pod -n <ns> web-123`, then logs
"Restart worker deployment"	Ask confirmation → delete pods or rollout restart if supported	`kubectl rollout restart deployment/<name> -n <ns>`
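
For the restart row, the confirm-then-act flow might look like the following on the kubectl side. `confirm_restart` is a hypothetical helper name and the prompt wording is illustrative only.

```shell
# Hypothetical helper: ask before restarting a deployment, then watch the rollout.
# Usage: confirm_restart <namespace> <deployment>
confirm_restart() {
  ns="$1"; name="$2"
  if [ -z "$ns" ] || [ -z "$name" ]; then
    echo "usage: confirm_restart <namespace> <deployment>" >&2
    return 2
  fi
  printf 'Restart deployment/%s in namespace %s? [y/N] ' "$name" "$ns" >&2
  read -r answer
  case "$answer" in
    y|Y|yes|YES)
      # Trigger a rolling restart, then wait for it to complete
      kubectl rollout restart "deployment/$name" --namespace "$ns" &&
        kubectl rollout status "deployment/$name" --namespace "$ns"
      ;;
    *)
      # Anything other than an explicit yes leaves the cluster untouched
      echo "aborted" >&2
      return 1
      ;;
  esac
}
```

Defaulting to "no" means an empty or mistyped answer never restarts anything.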

Tool Limitations

MCP Limitations:

Some advanced operations may not be available (port-forward, plugin commands)
Complex manifest edits may require kubectl fallback

kubectl Limitations:

Requires user approval for each command
Less efficient for multiple sequential operations
No pre-approval mechanism is available
Complex edits to manifests may require manual patching

Maintainer

timbuchinger Core maintainer

Source details

Full Name: timbuchinger/loadout
Branch: main
Path in repo: skills/kubernetes-troubleshoot
