Agent skill
gpu-monitor
Check GPU status, running experiments, and available resources
Install this agent skill to your Project
npx add-skill https://github.com/Xiangyue-Zhang/auto-deep-researcher-24x7/tree/main/skills/gpu-monitor
SKILL.md
/gpu-monitor
Quick GPU status check for experiment management.
Usage
/gpu-monitor
/gpu-monitor --server user@remote-host
Behavior
- Run nvidia-smi to get current GPU status
- Display a clean summary table:
  - GPU ID, Name, Memory (used/total), Utilization %, Temperature
  - Running processes on each GPU
- Identify which GPUs are free (< 1GB memory used)
- Identify which GPUs are running experiments (check for python/torchrun processes)
- If --server is provided, SSH to the remote server first
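The steps above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the names `Gpu`, `parse`, `query`, and `snapshot` are hypothetical, and the sketch assumes `nvidia-smi` is on the PATH (locally or on the remote host) and reports numeric values for every queried field. It uses the CSV query mode of `nvidia-smi` and joins processes to GPUs by UUID.

```python
import subprocess
from dataclasses import dataclass, field

FREE_THRESHOLD_MIB = 1024               # "free" means < 1 GB used, per the rule above
TRAINING_NAMES = ("python", "torchrun")  # process names treated as experiments

GPU_FIELDS = "index,uuid,name,memory.used,memory.total,utilization.gpu,temperature.gpu"
APP_FIELDS = "gpu_uuid,pid,process_name"


@dataclass
class Gpu:
    index: int
    uuid: str
    name: str
    mem_used: int    # MiB
    mem_total: int   # MiB
    util: int        # percent
    temp: int        # degrees C
    procs: list = field(default_factory=list)  # (pid, process_name) tuples

    @property
    def is_free(self):
        return self.mem_used < FREE_THRESHOLD_MIB

    @property
    def training_pids(self):
        # Match on the executable's base name, so /usr/bin/python3 counts.
        return [pid for pid, pname in self.procs
                if any(t in pname.rsplit("/", 1)[-1] for t in TRAINING_NAMES)]


def parse(gpu_csv, app_csv):
    """Turn the two CSV outputs into a list of Gpu records sorted by index."""
    gpus = {}
    for line in gpu_csv.strip().splitlines():
        idx, uuid, name, used, total, util, temp = [f.strip() for f in line.split(",")]
        gpus[uuid] = Gpu(int(idx), uuid, name, int(used), int(total), int(util), int(temp))
    for line in app_csv.strip().splitlines():
        uuid, pid, pname = [f.strip() for f in line.split(",", 2)]
        if uuid in gpus:
            gpus[uuid].procs.append((int(pid), pname))
    return sorted(gpus.values(), key=lambda g: g.index)


def query(flag, fields, server=None):
    """Run one nvidia-smi CSV query, over SSH first when a server is given."""
    cmd = ["nvidia-smi", f"--{flag}={fields}", "--format=csv,noheader,nounits"]
    if server:
        cmd = ["ssh", server] + cmd
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def snapshot(server=None):
    """Collect GPU and process data, then classify each GPU."""
    return parse(query("query-gpu", GPU_FIELDS, server),
                 query("query-compute-apps", APP_FIELDS, server))
```

Calling `snapshot()` checks the local machine and `snapshot("user@remote-host")` runs the same queries over SSH. Keeping `parse` separate from `query` lets the classification rules be tested on captured output without a GPU present.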
Output Format
GPU Status
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GPU  Name        Memory         Util  Temp
0    L20X 144GB  45123/147456   98%   72°C  ← training (PID 12345)
1    L20X 144GB  234/147456     0%    35°C  ← FREE
2    L20X 144GB  43210/147456   95%   70°C  ← training (PID 12346)
3    L20X 144GB  1024/147456    12%   40°C  ← keeper
Free GPUs: [1]
Training: GPU 0 (PID 12345), GPU 2 (PID 12346)
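A report in this shape could be produced with a small formatter like the one below. It is a sketch under assumptions: `render` and the dictionary keys are hypothetical names, column widths are chosen by eye, and labelling every occupied GPU without a training process as "keeper" is a guess based on the sample, not a documented rule.

```python
def render(gpus):
    """Format GPU records as a status report.

    gpus: list of dicts with keys index, name, used, total, util, temp,
    and train_pid (an int, or None when no training process was found).
    """
    lines = [
        "GPU Status",
        "━" * 42,
        f"{'GPU':<5}{'Name':<12}{'Memory':<15}{'Util':<6}{'Temp':<6}",
    ]
    free, training = [], []
    for g in gpus:
        if g["used"] < 1024:                 # same < 1 GB rule as above
            note = "FREE"
            free.append(g["index"])
        elif g["train_pid"] is not None:
            note = f"training (PID {g['train_pid']})"
            training.append(f"GPU {g['index']} (PID {g['train_pid']})")
        else:
            note = "keeper"                  # assumption: occupied, not training
        mem = f"{g['used']}/{g['total']}"
        util = f"{g['util']}%"
        temp = f"{g['temp']}°C"
        lines.append(
            f"{g['index']:<5}{g['name']:<12}{mem:<15}{util:<6}{temp:<6}← {note}"
        )
    lines.append(f"Free GPUs: {free}")
    lines.append("Training: " + (", ".join(training) or "none"))
    return "\n".join(lines)
```

Returning a string instead of printing directly keeps the formatter easy to test and lets the caller decide where the report goes.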