evaluator-optimizer
Iterative refinement workflow for polishing code, documentation, or designs through systematic evaluation and improvement cycles. Use when refining drafts into production-grade quality.
Install this agent skill to your Project
npx add-skill https://github.com/NickCrew/Claude-Cortex/tree/main/skills/evaluator-optimizer
Evaluator-Optimizer
Iterative refinement workflow that takes existing code, documentation, or designs and polishes them through rigorous cycles of evaluation and improvement until they meet production-grade quality standards.
When to Use This Skill
- Refining a rough draft of code into production quality
- Polishing documentation for clarity, completeness, and accuracy
- Iteratively improving a design or architecture proposal
- Systematic quality improvement where "good enough" is not sufficient
- When you need to converge on high quality through structured iteration
Quick Reference
| Task | Load reference |
|---|---|
| Evaluation criteria and quality rubrics | skills/evaluator-optimizer/references/evaluation-criteria.md |
Workflow: The Loop
For any given artifact (code, text, design):
1. Accept: Take the current version of the artifact.
2. Evaluate: Act as a harsh critic. Rate the artifact on correctness, clarity, efficiency, style, and safety. Assign a score out of 100.
3. Decide:
   - Score >= 90: Stop and present the result.
   - Score < 90: Refine.
4. Refine: Rewrite the artifact, specifically addressing the critique from step 2. List what changed and why.
5. Repeat: Return to step 2 with the new version.
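The loop above can be sketched in Python roughly as follows. This is a minimal illustration, not part of the skill itself: `evaluate` and `refine` are hypothetical callables standing in for the critic and the rewriter, and `max_iterations` is an added safety cap that the skill does not specify.

```python
from typing import Callable, List, Tuple

PASS_THRESHOLD = 90  # from the Decide step: stop at a score of 90 or higher


def refine_until_good(
    artifact: str,
    evaluate: Callable[[str], Tuple[int, List[str]]],  # hypothetical: returns (score, critique)
    refine: Callable[[str, List[str]], str],           # hypothetical: returns a rewritten artifact
    max_iterations: int = 10,                          # assumption: safety cap, not in the skill
) -> Tuple[str, int, int]:
    """Run the evaluate/decide/refine loop.

    Returns (final artifact, final score, number of evaluations performed).
    """
    score, critique = evaluate(artifact)
    evaluations = 1
    while score < PASS_THRESHOLD and evaluations < max_iterations:
        # Refine against the critique from the most recent evaluation, then re-evaluate.
        artifact = refine(artifact, critique)
        score, critique = evaluate(artifact)
        evaluations += 1
    return artifact, score, evaluations


# Toy stand-ins, purely for demonstration: each "!" is worth 20 points.
def toy_evaluate(text: str) -> Tuple[int, List[str]]:
    return min(100, 60 + 20 * text.count("!")), ["needs more polish"]


def toy_refine(text: str, critique: List[str]) -> str:
    return text + "!"


final, final_score, rounds = refine_until_good("draft", toy_evaluate, toy_refine)
print(final, final_score, rounds)
```

In practice the critic and rewriter would be the agent itself; the sketch only shows the control flow of Accept, Evaluate, Decide, Refine, and Repeat.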
Behavioral Rules
- Do not settle: "Good enough" is not good enough. You are here to polish.
- Be explicit: When evaluating, list specific flaws. "The function `process_data` is O(n^2) but could be O(n)."
- Show your work: Summarize changes in each iteration.
- Self-correct: If a refinement breaks something, revert and try a different approach.
- Converge: Each iteration must improve the score. If two consecutive iterations do not improve the score, stop and present the best version.
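One possible reading of the Converge rule, sketched in Python. The helper names `should_stop` and `best_index` are hypothetical, and "do not improve" is interpreted here as "neither of the last two iterations beat the best earlier score", which is an assumption about the intended meaning.

```python
from typing import List


def should_stop(scores: List[int], threshold: int = 90, patience: int = 2) -> bool:
    """Decide whether to stop, given the score history (oldest first).

    Stops when the latest score reaches the threshold, or when the last
    `patience` iterations produced no score above the best earlier one.
    """
    if not scores:
        return False
    if scores[-1] >= threshold:
        return True
    if len(scores) <= patience:
        return False  # not enough history to call it a stall
    best_before = max(scores[:-patience])
    return max(scores[-patience:]) <= best_before


def best_index(scores: List[int]) -> int:
    """Index of the best-scoring version (earliest one on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


history = [60, 75, 75, 70]
print(should_stop(history), best_index(history))
```

Keeping the full score history, rather than only the latest score, is what makes it possible to "present the best version" after a stall.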
Iteration Output Template
## Iteration [N] Evaluation
| Criterion | Score (1-10) | Notes |
|-----------|-------------|-------|
| Correctness | | |
| Clarity | | |
| Efficiency | | |
| Style | | |
| Safety | | |
| **Total** | **/50** | **[Total x 100/50 = score out of 100]** |
### Issues Found
1. [Specific issue with location]
2. [Specific issue with location]
### Refinements Applied
- [Change 1 and rationale]
- [Change 2 and rationale]
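The arithmetic behind the Total row of the template can be sketched as below. The function name `total_score` is hypothetical, and equal weighting of the five criteria is an assumption read off the template's straight /50 sum.

```python
from typing import Dict

# The five criteria from the iteration template, each rated 1-10.
CRITERIA = ("Correctness", "Clarity", "Efficiency", "Style", "Safety")


def total_score(ratings: Dict[str, int]) -> Dict[str, int]:
    """Sum the five 1-10 ratings (max 50) and scale the sum to a score out of 100."""
    missing = [name for name in CRITERIA if name not in ratings]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for name in CRITERIA:
        if not 1 <= ratings[name] <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {ratings[name]}")
    raw = sum(ratings[name] for name in CRITERIA)
    return {"raw": raw, "out_of_100": raw * 100 // 50}


result = total_score(
    {"Correctness": 9, "Clarity": 8, "Efficiency": 6, "Style": 7, "Safety": 10}
)
print(result)
```

With these sample ratings the raw total is 40/50, which scales to 80/100 and falls below the 90 threshold, so another refinement round would follow.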
Example Interaction
Input: "Refine this Python script."
Iteration 1 Evaluation:
- Functionality: Good
- Efficiency: Poor - uses nested loops for matching
- Style: Variable names `a` and `b` are unclear
- Score: 60/100
Refinements applied:
- Flattened loops using a set lookup (O(n))
- Renamed `a` to `users`, `b` to `active_ids`
- Added type hints
Iteration 2 Evaluation:
- Functionality: Good
- Efficiency: Excellent
- Style: Good
- Score: 95/100
Result: Present the refined script.
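The script in this example is not shown, so the following is only a hypothetical reconstruction of what such a refinement might look like. The function names and the `"id"` key are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List


# Before (hypothetical): nested loops, unclear names, no type hints.
def match_slow(a, b):
    result = []
    for x in a:
        for y in b:  # O(len(a) * len(b)) comparisons
            if x["id"] == y:
                result.append(x)
    return result


# After (hypothetical): set lookup, descriptive names, type hints.
def find_active_users(
    users: List[Dict[str, Any]], active_ids: Iterable[int]
) -> List[Dict[str, Any]]:
    """Return the users whose "id" appears in active_ids, in their original order."""
    active = set(active_ids)  # average O(1) membership checks
    return [user for user in users if user["id"] in active]


users = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bo"}, {"id": 3, "name": "Cy"}]
active_ids = [3, 1]
print(find_active_users(users, active_ids))
```

One behavioral difference to check during evaluation: if the ID list contains duplicates, the nested-loop version returns a user once per duplicate, while the set-based version returns each user once.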