Agent skill
examples-auto-run
Run Python examples in auto mode with logging, rerun helpers, and background control.
Install this agent skill to your Project
npx add-skill https://github.com/openai/openai-agents-python/tree/main/.agents/skills/examples-auto-run
What it does
- Runs `uv run examples/run_examples.py` with:
  - `EXAMPLES_INTERACTIVE_MODE=auto` (auto-input/auto-approve).
  - Per-example logs under `.tmp/examples-start-logs/`.
  - Main summary log path passed via `--main-log` (also under `.tmp/examples-start-logs/`).
  - Generates a rerun list of failures at `.tmp/examples-rerun.txt` when `--write-rerun` is set.
- Provides start/stop/status/logs/tail/collect/rerun helpers via `run.sh`.
- Background option keeps the process running with a pidfile; `stop` cleans it up.
Usage
# Start (auto mode; interactive included by default)
.agents/skills/examples-auto-run/scripts/run.sh start [extra args to run_examples.py]
# Examples:
.agents/skills/examples-auto-run/scripts/run.sh start --filter basic
.agents/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio
# Check status
.agents/skills/examples-auto-run/scripts/run.sh status
# Stop running job
.agents/skills/examples-auto-run/scripts/run.sh stop
# List logs
.agents/skills/examples-auto-run/scripts/run.sh logs
# Tail latest log (or specify one)
.agents/skills/examples-auto-run/scripts/run.sh tail
.agents/skills/examples-auto-run/scripts/run.sh tail main_20260113-123000.log
# Collect rerun list from a main log (defaults to latest main_*.log)
.agents/skills/examples-auto-run/scripts/run.sh collect
# Rerun only failed entries from rerun file (auto mode)
.agents/skills/examples-auto-run/scripts/run.sh rerun
Defaults (overridable via env)
- `EXAMPLES_INTERACTIVE_MODE=auto`
- `EXAMPLES_INCLUDE_INTERACTIVE=1`
- `EXAMPLES_INCLUDE_SERVER=0`
- `EXAMPLES_INCLUDE_AUDIO=0`
- `EXAMPLES_INCLUDE_EXTERNAL=0`
- Auto-approvals in auto mode: `APPLY_PATCH_AUTO_APPROVE=1`, `SHELL_AUTO_APPROVE=1`, `AUTO_APPROVE_MCP=1`
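The "overridable via env" behavior can be illustrated with a minimal Python sketch. The helper name `resolve_env` is hypothetical and is not part of `run.sh` or `run_examples.py`; it also assumes the auto-approval variables are only applied when the interactive mode is `auto`, which the list above suggests but does not spell out.

```python
import os

# Defaults as listed above; a value already present in the environment wins.
DEFAULTS = {
    "EXAMPLES_INTERACTIVE_MODE": "auto",
    "EXAMPLES_INCLUDE_INTERACTIVE": "1",
    "EXAMPLES_INCLUDE_SERVER": "0",
    "EXAMPLES_INCLUDE_AUDIO": "0",
    "EXAMPLES_INCLUDE_EXTERNAL": "0",
}
AUTO_APPROVALS = {
    "APPLY_PATCH_AUTO_APPROVE": "1",
    "SHELL_AUTO_APPROVE": "1",
    "AUTO_APPROVE_MCP": "1",
}


def resolve_env(environ=None):
    """Hypothetical helper: layer the caller's environment over the defaults."""
    environ = dict(os.environ if environ is None else environ)
    overrides = {k: v for k, v in environ.items() if k in DEFAULTS}
    resolved = {**DEFAULTS, **overrides}
    # Assumption: auto-approvals only apply when running in auto mode.
    if resolved["EXAMPLES_INTERACTIVE_MODE"] == "auto":
        for key, value in AUTO_APPROVALS.items():
            resolved[key] = environ.get(key, value)
    return resolved
```

For instance, `resolve_env({"EXAMPLES_INCLUDE_SERVER": "1"})` keeps every other default and flips only the server flag, which mirrors exporting that variable before calling `run.sh start`.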
Log locations
- Main logs: `.tmp/examples-start-logs/main_*.log`
- Per-example logs (from `run_examples.py`): `.tmp/examples-start-logs/<module_path>.log`
- Rerun list: `.tmp/examples-rerun.txt`
- Stdout logs: `.tmp/examples-start-logs/stdout_*.log`
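To inspect these locations from Python, the sketch below picks the most recent main log. The function `latest_main_log` is a hypothetical helper, and it assumes log names embed a sortable timestamp (as in `main_20260113-123000.log`), so sorting by name matches chronological order; the real `run.sh` may select the latest log differently.

```python
from pathlib import Path

LOG_DIR = Path(".tmp/examples-start-logs")


def latest_main_log(log_dir=LOG_DIR):
    """Return the newest main_*.log by file name, or None if none exist."""
    # Assumption: timestamped names sort lexicographically in time order.
    candidates = sorted(Path(log_dir).glob("main_*.log"))
    return candidates[-1] if candidates else None
```

The `stdout_*.log` files live in the same directory but are ignored here because the glob only matches the `main_` prefix.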
Notes
- The runner delegates to `uv run examples/run_examples.py`, which already writes per-example logs and supports `--collect`, `--rerun-file`, and `--print-auto-skip`.
- `start` uses `--write-rerun` so failures are captured automatically.
- If `.tmp/examples-rerun.txt` exists and is non-empty, invoking the skill with no args runs `rerun` by default.
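The no-args rule in the last note can be sketched as follows. `default_command` is a hypothetical name, "non-empty" is interpreted as a file size greater than zero, and falling back to `start` when there is no usable rerun file is an assumption the notes do not state explicitly.

```python
from pathlib import Path

RERUN_FILE = Path(".tmp/examples-rerun.txt")


def default_command(rerun_file=RERUN_FILE):
    """Choose the subcommand to run when the skill is invoked with no args."""
    path = Path(rerun_file)
    # A rerun list that exists and has content takes priority.
    if path.is_file() and path.stat().st_size > 0:
        return "rerun"
    # Assumption: with no pending failures, fall back to a fresh start.
    return "start"
```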
Behavioral validation (Codex/LLM responsibility)
The runner does not perform any automated behavioral validation. After every foreground start or rerun, Codex must manually validate all exit-0 entries:
- Read the example source (and comments) to infer intended flow, tools used, and expected key outputs.
- Open the matching per-example log under `.tmp/examples-start-logs/`.
- Confirm the intended actions/results occurred; flag omissions or divergences.
- Do this for all passed examples, not just a sample.
- Report immediately after the run with concise citations to the exact log lines that justify the validation.