Topic: ai-agents
18,135 skills in this topic.
-
reflect
Pattern recognition across your product decisions. Analyzes saved strategy sessions to surface themes, recurring risks, and suggested next steps.
breethomas/bette-think 13
-
shape-up
Shape work using the Shape Up methodology (Ryan Singer, Basecamp). Walk through the 4-step shaping process to create pitches ready for betting. Distinguishes between established product mode (fixed time, variable scope) and new product mode (looser constraints). Use when planning cycle work, writing pitches, or coaching PMs on shaping.
breethomas/bette-think 13
-
spec
Write specifications at the right depth for any project. Progressive disclosure from quick Linear issues to full AI feature specs. Embeds Linear Method philosophy (brevity, clarity, momentum) with context engineering for AI features. Use for any spec work - quick tasks, features, or AI products.
breethomas/bette-think 13
-
start-evals
Start AI evals without overengineering. Create your first 20 test cases in a spreadsheet using the PM-Friendly Evals approach.
breethomas/bette-think 13
-
strategy-session
Your product soundboard. Work through product decisions conversationally - Claude gathers context, challenges assumptions, captures decisions, and creates Linear issues.
breethomas/bette-think 13
-
workspace-calibration
Analyze Linear workspace health and usage patterns before jumping into backlog work. Like a pre-flight check for a new PM joining a team or organization.
breethomas/bette-think 13
-
build-judge
Build an LLM-as-Judge evaluator for one specific failure mode. Binary pass/fail only. Use when a failure mode requires interpretation (tone, faithfulness, relevance, completeness) and cannot be checked with code. Do NOT use when the failure can be checked with regex, schema validation, or execution tests. Do NOT use before completing error analysis (/upgrade-evals).
breethomas/bette-think 13
-
eval-rag
Evaluate RAG pipeline retrieval and generation quality separately. Measure Recall@k, Precision@k, MRR, NDCG@k for retrieval. Assess faithfulness and relevance for generation. Use when the AI feature uses retrieval (search, knowledge base, document QA). Do NOT use for non-RAG AI features.
breethomas/bette-think 13
-
generate-test-data
Create diverse synthetic test inputs using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead).
breethomas/bette-think 13
-
upgrade-evals
Systematic error analysis on real AI traces. Read traces, judge pass/fail, let failure categories emerge from data, compute failure rates, decide what to fix. Use when you have 50+ test cases or are seeing production failures. Do NOT use when you have fewer than 20 test cases (use /start-evals first).
breethomas/bette-think 13
-
context-engineering
Strategies for managing LLM context windows effectively in AI agents. Use when building agents that handle long conversations, multi-step tasks, or tool orchestration, or that need to maintain coherence across extended interactions.
itsmostafa/llm-engineering-skills 17
-
lora
Parameter-efficient fine-tuning with Low-Rank Adaptation (LoRA). Use when fine-tuning large language models with limited GPU memory, creating task-specific adapters, or training multiple specialized models from a single base.
itsmostafa/llm-engineering-skills 17
-
mlx
Running and fine-tuning LLMs on Apple Silicon with MLX. Use when working with models locally on Mac, converting Hugging Face models to MLX format, fine-tuning with LoRA/QLoRA on Apple Silicon, or serving models via HTTP API.
itsmostafa/llm-engineering-skills 17
-
pytorch
Building and training neural networks with PyTorch. Use when implementing deep learning models, training loops, data pipelines, model optimization with torch.compile, distributed training, or deploying PyTorch models.
itsmostafa/llm-engineering-skills 17
-
agents
Patterns and architectures for building AI agents and workflows with LLMs. Use when designing systems that involve tool use, multi-step reasoning, autonomous decision-making, or orchestration of LLM-driven tasks.
itsmostafa/llm-engineering-skills 17
-
prompt-engineering
Crafting effective prompts for LLMs. Use when designing prompts, improving output quality, structuring complex instructions, or debugging poor model responses.
itsmostafa/llm-engineering-skills 17
-
qlora
Memory-efficient fine-tuning with 4-bit quantization and LoRA adapters. Use when fine-tuning large models (7B+) on consumer GPUs, when VRAM is limited, or when standard LoRA still exceeds memory. Builds on the lora skill.
itsmostafa/llm-engineering-skills 17
-
rlhf
Understanding Reinforcement Learning from Human Feedback (RLHF) for aligning language models. Use when learning about preference data, reward modeling, policy optimization, or direct alignment algorithms like DPO.
itsmostafa/llm-engineering-skills 17
-
transformers
Loading and using pretrained models with Hugging Face Transformers. Use when working with pretrained models from the Hub, running inference with Pipeline API, fine-tuning models with Trainer, or handling text, vision, audio, and multimodal tasks.
itsmostafa/llm-engineering-skills 17
-
golang-pro
Use when building Go applications requiring concurrent programming, microservices architecture, or high-performance systems. Invoke for goroutines, channels, Go generics, and gRPC integration.
DeevsDeevs/agent-system 36
-
97-dev
Apply timeless programming wisdom from "97 Things Every Programmer Should Know" when writing, reviewing, or refactoring code. Use for design decisions, code quality checks, professional development guidance, testing strategies, and workflow optimization.
DeevsDeevs/agent-system 36
-
alpha-squad
Run multi-lens hypothesis generation across accounting, flows, networks, microstructure, and causal checks. Use when you need market hypotheses with mechanism, counterparty, and decay. Triggers: alpha squad, fundamentalist, vulture, network-architect, book-physicist, causal-detective, hypothesis generation, mechanism first.
DeevsDeevs/agent-system 36
-
anti-ai-slop
After working on the code, ensure the branch contains only the minimal, idiomatic changes by removing AI-generated slop introduced on this branch.
DeevsDeevs/agent-system 36
-
bug-hunters
Run systematic bug hunting with spec reconstruction, adversarial validation, and confidence scoring. Use when you want to hunt bugs (not fix them), validate correctness, or run logic-first/code-first investigations. Triggers: bug hunt, spec reconstruction, logic-first, code-first, orchestrator, logic-hunter, cpp-hunter, python-hunter.
DeevsDeevs/agent-system 36