bernstein-quality

Show quality metrics for Bernstein runs: success rates per model, lint/typecheck/test pass rates, and completion time distributions. Use when the user asks about quality, reliability, which model performs best, or pass rates.

Install this agent skill in your project:

npx add-skill https://github.com/chernistry/bernstein/tree/main/packages/cursor-plugin/skills/bernstein-quality

SKILL.md

Bernstein Quality Metrics

Analyze quality and reliability of agent-generated code.

When to Use

  • User asks "how reliable are the agents?" or "which model is best?"
  • User wants success rates, pass rates, or completion time stats
  • User asks about test failures or lint issues across models
  • User says "show me quality metrics"

Instructions

  1. Run scripts/quality.sh metrics for overall quality metrics.

  2. Run scripts/quality.sh pass-rates for lint/typecheck/test pass rates by model.

  3. Run scripts/quality.sh times for completion time distributions.

  4. Present a quality dashboard:

## Quality Dashboard

### Success Rate by Model
| Model | Tasks | Success | Fail | Rate |
|-------|-------|---------|------|------|
| claude-sonnet-4 | 24 | 22 | 2 | 91.7% |
| gpt-4.1 | 12 | 10 | 2 | 83.3% |

### Pass Rates
| Check | Overall | claude-sonnet-4 | gpt-4.1 |
|-------|---------|-----------------|---------|
| Lint | 96% | 98% | 92% |
| Type-check | 88% | 91% | 83% |
| Tests | 85% | 89% | 75% |

### Completion Times
| Percentile | Time |
|------------|------|
| p50 | 3m 20s |
| p90 | 8m 45s |
| p99 | 15m 12s |
  5. Highlight any models with significantly lower pass rates.

  6. Recommend model routing adjustments if one model consistently underperforms.
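
The dashboard figures and the "significantly lower" check in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the logic of scripts/quality.sh: the `ModelStats` record, the 5-point `margin` threshold, the nearest-rank percentile method, and the helper names are all assumptions made for the example, since the skill does not specify the script's output format or what counts as "significantly lower".

```python
import math
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Hypothetical per-model record; the real script output may differ."""
    model: str
    tasks: int
    success: int

    @property
    def rate(self) -> float:
        # Success rate as a percentage; 0.0 when no tasks were run.
        return 100.0 * self.success / self.tasks if self.tasks else 0.0


def flag_underperformers(stats, margin=5.0):
    """Return models whose rate trails the best model by more than `margin` points.

    The 5-point default is an assumed threshold, chosen only for illustration.
    """
    if not stats:
        return []
    best = max(s.rate for s in stats)
    return [s.model for s in stats if best - s.rate > margin]


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def format_duration(seconds):
    """Render a duration in seconds in the dashboard's 'Xm Ys' style."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}m {secs}s"


if __name__ == "__main__":
    # Sample counts taken from the dashboard template above.
    stats = [
        ModelStats("claude-sonnet-4", 24, 22),
        ModelStats("gpt-4.1", 12, 10),
    ]
    for s in stats:
        print(f"{s.model}: {s.rate:.1f}%")  # 91.7% and 83.3%
    print("Flagged:", flag_underperformers(stats))  # ['gpt-4.1']
```

With the sample counts from the template, gpt-4.1 trails claude-sonnet-4 by about 8.3 points and is flagged under the assumed 5-point margin; raising the margin to 10 would flag nothing. With only a dozen tasks per model, a gap of this size can come from one or two failures, so a routing recommendation is better supported when the gap persists across several runs.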
