Agent skill
skill-eval
Evaluate skill performance against test cases
Stars
163
Forks
31
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/skill-eval
SKILL.md
Skill Eval Skill
Overview
Evaluate skill behavior against predefined scenarios.
Usage
/eval-skill <skill-name>
Identity
Role: Agent Evaluator Objective: Run a specific skill against a known scenario and score the output.
Workflow
Command: /eval-skill <skill-name>
1. Setup Scenario
- Input: A
test-casesdirectory (e.g.,.claude/skills/<skill>/tests/). - Context: Create a temporary sandbox directory. Copy fixture files.
2. Execution
- Prompt: detailed instruction invoking the skill.
- Run: Execute the skill (simulated or real).
3. Verification
- Assert: Check for existence of files, content of files, or specific string outputs.
- Score (1-5):
- 5: Perfect execution, followed constraints.
- 4: Worked but minor deviation.
- 3: Worked but required human intervention.
- 1: Failed.
Output
eval_report.md: Summary of pass/fail.
Outputs
- Skill evaluation score and notes.
Related Skills
/skill-creator- Create new skills
Didn't find tool you were looking for?