skill-eval

Evaluate skill performance against test cases

Install this agent skill in your project:

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/skill-eval

SKILL.md

Skill Eval Skill

Overview

Evaluate skill behavior against predefined scenarios.

/eval-skill <skill-name>

Role: Agent Evaluator

Objective: Run a specific skill against a known scenario and score the output.

Command: /eval-skill <skill-name>

Assert: Check that expected files exist, that files contain the expected content, or that the output includes specific strings.
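The three kinds of assertion named above could be sketched in Python roughly as follows. The function names (`assert_file_exists`, `assert_file_contains`, `assert_output_contains`) are hypothetical and not part of the skill; plain substring matching is an assumption, since the skill does not say how content is compared.

```python
from pathlib import Path


def assert_file_exists(path: str) -> bool:
    """Pass if the path exists and is a regular file."""
    return Path(path).is_file()


def assert_file_contains(path: str, expected: str) -> bool:
    """Pass if the file exists and its text includes the expected substring."""
    p = Path(path)
    return p.is_file() and expected in p.read_text(encoding="utf-8")


def assert_output_contains(output: str, expected: str) -> bool:
    """Pass if the skill's captured output includes the expected substring."""
    return expected in output
```

Each check returns a boolean instead of raising, so an evaluator can collect every result before deciding on a score.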
Score (1-5):
- 5: Perfect execution; all constraints followed.
- 4: Worked, with a minor deviation.
- 3: Worked, but required human intervention.
- 1: Failed.
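One way to turn the rubric into a repeatable score is a small mapping function. This is a minimal sketch: `score_run` and its three flags are hypothetical, and the precedence (failure first, then human intervention, then minor deviation) is an assumption the rubric does not state. The rubric lists no score of 2, so the function never returns one.

```python
def score_run(
    assertions_passed: bool,
    minor_deviation: bool = False,
    human_intervention: bool = False,
) -> int:
    """Map the outcome of an evaluation run to the rubric's 1-5 score."""
    if not assertions_passed:
        return 1  # Failed.
    if human_intervention:
        return 3  # Worked but required human intervention.
    if minor_deviation:
        return 4  # Worked but minor deviation.
    return 5  # Perfect execution, followed constraints.
```

For example, `score_run(True, minor_deviation=True)` yields 4, while any run whose assertions fail yields 1 regardless of the other flags.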