Topic: gpt-5

109 skills in this topic.

sc-skill Capture deterministic macOS screenshots for testing, docs, release notes, and marketing assets. Use when asked to automate app screenshots, batch-generate screenshot sets, standardize window sizing/composition, or choose between Peekaboo and native macOS screenshot tooling.
jazzyalex/agent-sessions 466
deploy Use when shipping a release of Agent Sessions — bumping version, updating CHANGELOG, building, signing, notarizing, publishing appcast, and creating a GitHub release.
jazzyalex/agent-sessions 466
agent-session-format-check Verify agent session format compatibility for Agent Sessions. Use when any agent CLI updates, when monitoring flags drift, or when bumping max verified versions (fixtures + docs + tests). Covers session schema, usage/limits tracking, storage backends, and discovery path contracts for all supported agents.
jazzyalex/agent-sessions 466
agent-support-matrix Maintain Agent Sessions agent support matrix and JSON/JSONL parsing compatibility. Use when checking upstream agent releases for session format changes, updating max verified versions in docs/agent-support/agent-support-matrix.yml, or updating docs/agent-json-tracking.md and fixtures/tests.
jazzyalex/agent-sessions 466
add-benchmark Add a new SWE benchmark task from a real GitHub bug-fix. Use when the user provides a GitHub issue or PR URL and wants to add it to the bench-swe pipeline.
ory/lumen 144
reindex Refresh or rebuild the bundled Lumen index for the current project, preferring MCP-driven refreshes and using the CLI only for an explicit clean rebuild.
ory/lumen 144
doctor Run a health check on the bundled Lumen semantic search setup for the current project, verify backend reachability and index freshness, and summarize remediation steps.
ory/lumen 144
scribegoat2-healthcare-eval Run trajectory-level healthcare AI safety evaluations using the ScribeGOAT2 framework. Use this skill when asked to evaluate medical AI safety persistence, run multi-turn trajectory analysis, detect Turn 2 cliff vulnerabilities, or generate safety disclosure reports for frontier lab review. This skill enforces deterministic execution, two-stage grading, healthcare context conditioning, and audit-grade reproducibility. All runs produce cryptographically verifiable evidence chains.
GOATnote-Inc/scribegoat2 4
bloom_integrity_verification Cryptographic integrity verification for AI safety evaluations using BLAKE3 hashing and Ed25519 signatures. Ensures scenarios haven't been tampered with and results are exactly reproducible.
GOATnote-Inc/scribegoat2 4
evaluator-brief-generator Generate frontier lab-specific evaluator briefs from ScribeGOAT2 evaluation results. Use this skill when asked to create technical safety briefs, disclosure documents, or presentation materials for OpenAI, Anthropic, DeepMind, or xAI safety teams. Produces audit-grade documentation calibrated to each lab's review culture, technical vocabulary, and safety priorities.
GOATnote-Inc/scribegoat2 4
msc_safety
GOATnote-Inc/scribegoat2 4
healthbench_evaluation Run HealthBench Hard benchmark evaluation using multi-specialist council architecture with deterministic safety stack.
GOATnote-Inc/scribegoat2 4
crisis_persistence_eval Multi-turn safety persistence evaluation for crisis scenarios. Tests whether AI models maintain appropriate safety responses when users express barriers, minimize distress, or attempt to disengage.
GOATnote-Inc/scribegoat2 4
phi_detection Scan repository for Protected Health Information (PHI) using HIPAA Safe Harbor patterns. Ensures evaluation data remains synthetic-only.
GOATnote-Inc/scribegoat2 4
coverage_decision_safety_review
GOATnote-Inc/scribegoat2 4
evaluation_v2 Anthropic-aligned medical safety evaluation with pass^k metrics, failure taxonomy, and anti-gaming graders.
GOATnote-Inc/scribegoat2 4
fhir_development
GOATnote-Inc/scribegoat2 4
model_comparison
GOATnote-Inc/scribegoat2 4
autoresearch Orchestrates end-to-end autonomous AI research projects using a two-loop architecture. The inner loop runs rapid experiment iterations with clear optimization targets. The outer loop synthesizes results, identifies patterns, and steers research direction. Routes to domain-specific skills for execution, supports continuous agent operation via Claude Code /loop and OpenClaw heartbeat, and produces research presentations and papers. Use when starting a research project, running autonomous experiments, or managing a multi-hypothesis research effort.
Orchestra-Research/AI-Research-SKILLs 6,644
optimizing-attention-flash Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.
Orchestra-Research/AI-Research-SKILLs 6,644
quantizing-models-bitsandbytes Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, when you need to fit larger models, or when you want faster inference. Supports INT8, NF4, FP4 formats, QLoRA training, and 8-bit optimizers. Works with HuggingFace Transformers.
Orchestra-Research/AI-Research-SKILLs 6,644
nemo-curator GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.
Orchestra-Research/AI-Research-SKILLs 6,644
ray-data Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.
Orchestra-Research/AI-Research-SKILLs 6,644
dspy Build complex AI systems with declarative programming, optimize prompts automatically, and create modular RAG systems and agents with DSPy, Stanford NLP's framework for systematic LM programming.
Orchestra-Research/AI-Research-SKILLs 6,644

