Topic: gpt-5
109 skills in this topic.
-
sc-skill
Capture deterministic macOS screenshots for testing, docs, release notes, and marketing assets. Use when asked to automate app screenshots, batch-generate screenshot sets, standardize window sizing/composition, or choose between Peekaboo and native macOS screenshot tooling.
jazzyalex/agent-sessions 466
-
deploy
Use when shipping a release of Agent Sessions — bumping version, updating CHANGELOG, building, signing, notarizing, publishing appcast, and creating a GitHub release.
jazzyalex/agent-sessions 466
-
agent-session-format-check
Verify agent session format compatibility for Agent Sessions. Use when any agent CLI updates, when monitoring flags drift, or when bumping max verified versions (fixtures + docs + tests). Covers session schema, usage/limits tracking, storage backends, and discovery path contracts for all supported agents.
jazzyalex/agent-sessions 466
-
agent-support-matrix
Maintain Agent Sessions agent support matrix and JSON/JSONL parsing compatibility. Use when checking upstream agent releases for session format changes, updating max verified versions in docs/agent-support/agent-support-matrix.yml, or updating docs/agent-json-tracking.md and fixtures/tests.
jazzyalex/agent-sessions 466
-
add-benchmark
Add a new SWE benchmark task from a real GitHub bug-fix. Use when the user provides a GitHub issue or PR URL and wants to add it to the bench-swe pipeline.
ory/lumen 144
-
reindex
Refresh or rebuild the bundled Lumen index for the current project, preferring MCP-driven refreshes and using the CLI only for an explicit clean rebuild.
ory/lumen 144
-
doctor
Run a health check on the bundled Lumen semantic search setup for the current project, verify backend reachability and index freshness, and summarize remediation steps.
ory/lumen 144
-
scribegoat2-healthcare-eval
Run trajectory-level healthcare AI safety evaluations using the ScribeGOAT2 framework. Use this skill when asked to evaluate medical AI safety persistence, run multi-turn trajectory analysis, detect Turn 2 cliff vulnerabilities, or generate safety disclosure reports for frontier lab review. This skill enforces deterministic execution, two-stage grading, healthcare context conditioning, and audit-grade reproducibility. All runs produce cryptographically verifiable evidence chains.
GOATnote-Inc/scribegoat2 4
-
bloom_integrity_verification
Cryptographic integrity verification for AI safety evaluations using BLAKE3 hashing and Ed25519 signatures. Ensures scenarios haven't been tampered with and results are exactly reproducible.
GOATnote-Inc/scribegoat2 4
-
evaluator-brief-generator
Generate frontier lab-specific evaluator briefs from ScribeGOAT2 evaluation results. Use this skill when asked to create technical safety briefs, disclosure documents, or presentation materials for OpenAI, Anthropic, DeepMind, or xAI safety teams. Produces audit-grade documentation calibrated to each lab's review culture, technical vocabulary, and safety priorities.
GOATnote-Inc/scribegoat2 4
-
msc_safety
GOATnote-Inc/scribegoat2 4
-
healthbench_evaluation
Run HealthBench Hard benchmark evaluation using multi-specialist council architecture with deterministic safety stack.
GOATnote-Inc/scribegoat2 4
-
crisis_persistence_eval
Multi-turn safety persistence evaluation for crisis scenarios. Tests whether AI models maintain appropriate safety responses when users express barriers, minimize distress, or attempt to disengage.
GOATnote-Inc/scribegoat2 4
-
phi_detection
Scan repository for Protected Health Information (PHI) using HIPAA Safe Harbor patterns. Ensures evaluation data remains synthetic-only.
GOATnote-Inc/scribegoat2 4
-
coverage_decision_safety_review
GOATnote-Inc/scribegoat2 4
-
evaluation_v2
Anthropic-aligned medical safety evaluation with pass^k metrics, failure taxonomy, and anti-gaming graders.
GOATnote-Inc/scribegoat2 4
-
fhir_development
GOATnote-Inc/scribegoat2 4
-
model_comparison
GOATnote-Inc/scribegoat2 4
-
autoresearch
Orchestrates end-to-end autonomous AI research projects using a two-loop architecture. The inner loop runs rapid experiment iterations with clear optimization targets. The outer loop synthesizes results, identifies patterns, and steers research direction. Routes to domain-specific skills for execution, supports continuous agent operation via Claude Code /loop and OpenClaw heartbeat, and produces research presentations and papers. Use when starting a research project, running autonomous experiments, or managing a multi-hypothesis research effort.
Orchestra-Research/AI-Research-SKILLs 6,644
-
optimizing-attention-flash
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.
Orchestra-Research/AI-Research-SKILLs 6,644
-
quantizing-models-bitsandbytes
Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, when you need to fit larger models, or when you want faster inference. Supports INT8, NF4, FP4 formats, QLoRA training, and 8-bit optimizers. Works with HuggingFace Transformers.
Orchestra-Research/AI-Research-SKILLs 6,644
-
nemo-curator
GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.
Orchestra-Research/AI-Research-SKILLs 6,644
-
ray-data
Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.
Orchestra-Research/AI-Research-SKILLs 6,644
-
dspy
Build complex AI systems with declarative programming, optimize prompts automatically, and create modular RAG systems and agents with DSPy, Stanford NLP's framework for systematic LM programming.
Orchestra-Research/AI-Research-SKILLs 6,644