codex
Use when the user asks to run Codex CLI (codex exec, codex resume) or references OpenAI Codex for code analysis, refactoring, or automated editing
Install this agent skill to your Project
npx add-skill https://github.com/skills-directory/skill-codex/tree/main/plugins/skill-codex/skills/codex
Codex Skill Guide
Running a Task
- Ask the user (via `AskUserQuestion`) which model to run (`gpt-5.4`, `gpt-5.3-codex-spark`, or `gpt-5.3-codex`) AND which reasoning effort to use (`xhigh`, `high`, `medium`, or `low`) in a single prompt with two questions.
- Select the sandbox mode required for the task; default to `--sandbox read-only` unless edits or network access are necessary.
- Assemble the command with the appropriate options:
  - `-m, --model <MODEL>`
  - `--config model_reasoning_effort="<xhigh|high|medium|low>"`
  - `--sandbox <read-only|workspace-write|danger-full-access>`
  - `--full-auto`
  - `-C, --cd <DIR>`
  - `--skip-git-repo-check`
  - `"your prompt here"` (as final positional argument)
- Always use `--skip-git-repo-check`.
- When continuing a previous session, pipe the new prompt via stdin to `codex exec --skip-git-repo-check resume --last`. When resuming, don't use any configuration flags unless the user explicitly requests them, e.g. by specifying the model or the reasoning effort when asking to resume a session. Resume syntax: `echo "your prompt here" | codex exec --skip-git-repo-check resume --last 2>/dev/null`. All flags must be inserted between `exec` and `resume`.
- IMPORTANT: By default, append `2>/dev/null` to all `codex exec` commands to suppress thinking tokens (stderr). Only show stderr if the user explicitly requests to see thinking tokens or if debugging is needed.
- Run the command, capture stdout/stderr (filtered as appropriate), and summarize the outcome for the user.
- After Codex completes, inform the user: "You can resume this Codex session at any time by saying 'codex resume' or asking me to continue with additional analysis or changes."
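The assembly steps above can be sketched as a small shell helper. This is a minimal illustration, not part of the skill itself: `run_codex` is a hypothetical name, the flags are the ones listed above, and the sketch assumes the user has already approved `--full-auto` for the write-capable sandbox modes.

```bash
# Sketch: assemble and run a `codex exec` command from the user's choices.
# run_codex is a hypothetical helper; variables are global (plain POSIX sh).
run_codex() {
    model=$1      # e.g. gpt-5.3-codex
    effort=$2     # xhigh | high | medium | low
    sandbox=$3    # read-only | workspace-write | danger-full-access
    prompt=$4     # passed as the final positional argument

    # Options used on every run.
    set -- exec --skip-git-repo-check \
        -m "$model" \
        --config "model_reasoning_effort=$effort" \
        --sandbox "$sandbox"

    # Assumption: --full-auto was approved for the write-capable modes.
    case $sandbox in
        workspace-write|danger-full-access) set -- "$@" --full-auto ;;
    esac

    # stderr carries thinking tokens, so it is dropped by default.
    codex "$@" "$prompt" 2>/dev/null
}
```

A call such as `run_codex gpt-5.3-codex high read-only "Summarize the error handling in src/"` would then run a read-only analysis with stderr suppressed.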
Quick Reference
| Use case | Sandbox mode | Key flags |
|---|---|---|
| Read-only review or analysis | `read-only` | `--sandbox read-only 2>/dev/null` |
| Apply local edits | `workspace-write` | `--sandbox workspace-write --full-auto 2>/dev/null` |
| Permit network or broad access | `danger-full-access` | `--sandbox danger-full-access --full-auto 2>/dev/null` |
| Resume recent session | Inherited from original | `echo "prompt" \| codex exec --skip-git-repo-check resume --last 2>/dev/null` (no configuration flags unless the user requests them) |
| Run from another directory | Match task needs | `-C <DIR>` plus other flags `2>/dev/null` |
Following Up
- After every `codex` command, immediately use `AskUserQuestion` to confirm next steps, collect clarifications, or decide whether to resume with `codex exec --skip-git-repo-check resume --last`.
- When resuming, pipe the new prompt via stdin: `echo "new prompt" | codex exec --skip-git-repo-check resume --last 2>/dev/null`. The resumed session automatically uses the same model, reasoning effort, and sandbox mode from the original session.
- Restate the chosen model, reasoning effort, and sandbox mode when proposing follow-up actions.
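A resume step can be wrapped the same way. The sketch below follows the resume syntax given under Running a Task; `resume_codex` is a hypothetical helper name, and it deliberately passes no model, effort, or sandbox flags because the session inherits them.

```bash
# Sketch: continue the most recent Codex session with a new prompt.
# resume_codex is a hypothetical helper name.
resume_codex() {
    # The prompt goes in on stdin; stderr (thinking tokens) is dropped.
    printf '%s\n' "$1" | codex exec --skip-git-repo-check resume --last 2>/dev/null
}
```

For example, `resume_codex "Now add unit tests for the changes you proposed"` would continue the last session with that prompt.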
Critical Evaluation of Codex Output
Codex is powered by OpenAI models with their own knowledge cutoffs and limitations. Treat Codex as a colleague, not an authority.
Guidelines
- Trust your own knowledge when confident. If Codex claims something you know is incorrect, push back directly.
- Research disagreements using WebSearch or documentation before accepting Codex's claims. Share findings with Codex via resume if needed.
- Remember knowledge cutoffs - Codex may not know about recent releases, APIs, or changes that occurred after its training data.
- Don't defer blindly - Codex can be wrong. Evaluate its suggestions critically, especially regarding:
  - Model names and capabilities
  - Recent library versions or API changes
  - Best practices that may have evolved
When Codex is Wrong
- State your disagreement clearly to the user
- Provide evidence (your own knowledge, web search, docs)
- Optionally resume the Codex session to discuss the disagreement. Identify yourself as Claude so Codex knows it's a peer AI discussion. Use your actual model name (e.g., the model you are currently running as) instead of a hardcoded name:
  ```bash
  echo "This is Claude (<your current model name>) following up. I disagree with [X] because [evidence]. What's your take on this?" | codex exec --skip-git-repo-check resume --last 2>/dev/null
  ```
- Frame disagreements as discussions, not corrections - either AI could be wrong
- Let the user decide how to proceed if there's genuine ambiguity
Error Handling
- Stop and report failures whenever `codex --version` or a `codex exec` command exits non-zero; request direction before retrying.
- Before you use high-impact flags (`--full-auto`, `--sandbox danger-full-access`, `--skip-git-repo-check`), ask the user for permission using `AskUserQuestion` unless it was already given.
- When output includes warnings or partial results, summarize them and ask how to adjust using `AskUserQuestion`.
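The stop-and-report rule can be illustrated with a small wrapper that checks the exit status of whatever it runs. This is only a sketch: `run_or_report` is a hypothetical helper name, and the message wording is an assumption rather than anything Codex prints.

```bash
# Sketch: run a command and report if it exits non-zero.
# run_or_report is a hypothetical helper name.
run_or_report() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "Command failed (exit $status): $*" >&2
        echo "Stopping; ask the user how to proceed before retrying." >&2
    fi
    # Propagate the original exit status so callers can stop on failure.
    return "$status"
}
```

Chaining calls with `&&`, as in `run_or_report codex --version && run_or_report codex exec ...`, would stop at the first failure instead of retrying automatically.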