# generate-config

Generate and validate mcpbr configuration files for MCP server benchmarking.

Install this agent skill to your project:

```bash
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/mcpbr-config
```

## Instructions

You are an expert at creating valid mcpbr configuration files. Your goal is to help users create correct YAML configs for their MCP servers.
## Critical Requirements

1. **Always include the `{workdir}` placeholder**: The `args` array MUST include `"{workdir}"` as a placeholder for the task repository path. This is CRITICAL: mcpbr replaces it at runtime with the actual working directory.

2. **Valid commands**: Ensure the `command` field uses an executable that exists on the user's system:
   - `npx` for Node.js-based MCP servers
   - `uvx` for Python MCP servers via uv
   - `python` or `python3` for direct Python execution
   - Custom binaries (verify they exist with `which <command>`)

3. **Model aliases**: Use short aliases when possible:
   - `sonnet` instead of `claude-sonnet-4-5-20250929`
   - `opus` instead of `claude-opus-4-5-20251101`
   - `haiku` instead of `claude-haiku-4-5-20251001`

4. **Required fields**: Every config MUST have:
   - `mcp_server.command`
   - `mcp_server.args` (with `"{workdir}"`)
   - `provider` (usually `"anthropic"`)
   - `agent_harness` (usually `"claude-code"`)
   - `model`
   - `dataset` (or rely on the benchmark default)
## Common MCP Server Configurations

### Anthropic Filesystem Server

```yaml
mcp_server:
  name: "filesystem"
  command: "npx"
  args:
    - "-y"
    - "@modelcontextprotocol/server-filesystem"
    - "{workdir}"
  env: {}
```
### Custom Python MCP Server

```yaml
mcp_server:
  name: "my-server"
  command: "uvx"
  args:
    - "my-mcp-server"
    - "--workspace"
    - "{workdir}"
  env:
    LOG_LEVEL: "debug"
```
### Supermodel Codebase Analysis

```yaml
mcp_server:
  name: "supermodel"
  command: "npx"
  args:
    - "-y"
    - "@supermodeltools/mcp-server"
  env:
    SUPERMODEL_API_KEY: "${SUPERMODEL_API_KEY}"
```
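A reference like `${SUPERMODEL_API_KEY}` is resolved from the environment at runtime. If mcpbr follows the common `${VAR}` convention, the expansion behaves like Python's `os.path.expandvars` — a sketch of the convention only, not mcpbr's actual implementation:

```python
import os

# Set the variable (placeholder value for illustration), then expand a
# config-style reference the way ${VAR} expansion conventionally works.
os.environ["SUPERMODEL_API_KEY"] = "sk-example"
value = os.path.expandvars("${SUPERMODEL_API_KEY}")
print(value)  # prints "sk-example"

# If the variable were unset, expandvars would leave the literal
# "${SUPERMODEL_API_KEY}" string unchanged -- a useful signal that the
# user forgot to export it.
```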
## Configuration Template

When generating a new config, use this template:

```yaml
mcp_server:
  name: "<server-name>"
  command: "<executable>"
  args:
    - "<arg1>"
    - "<arg2>"
    - "{workdir}"  # CRITICAL: include this placeholder
  env: {}

provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"  # or "opus", "haiku"
dataset: "SWE-bench/SWE-bench_Lite"  # or null to use the benchmark default
sample_size: 5
timeout_seconds: 300
max_concurrent: 4
max_iterations: 30
```
## Validation Steps

Before saving a config, validate:

1. **Workdir placeholder**: Ensure `"{workdir}"` appears in the `args` array.
2. **Command exists**: Verify the command is available:

   ```bash
   which npx  # or uvx, python, etc.
   ```

3. **Syntax**: YAML syntax is correct (no tabs, proper indentation).
4. **Environment variables**: If using env vars like `${API_KEY}`, remind the user to set them.
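The structural checks above can be scripted. A minimal sketch that inspects a parsed config dict shaped like the template; `validate_config` is a hypothetical helper for illustration, not part of mcpbr:

```python
def validate_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a parsed mcpbr config (hypothetical helper)."""
    problems = []
    server = cfg.get("mcp_server", {})
    # Rule 1: the args array must carry the {workdir} placeholder.
    if "{workdir}" not in server.get("args", []):
        problems.append('mcp_server.args is missing the "{workdir}" placeholder')
    # Rule 2: top-level required fields.
    for field in ("provider", "agent_harness", "model"):
        if field not in cfg:
            problems.append(f"missing required field: {field}")
    # Rule 3: the server needs a command to launch.
    if not server.get("command"):
        problems.append("mcp_server.command is empty")
    return problems

# Example: a config missing the placeholder.
bad = {
    "mcp_server": {"name": "fs", "command": "npx", "args": ["-y", "pkg"]},
    "provider": "anthropic",
    "agent_harness": "claude-code",
    "model": "sonnet",
}
print(validate_config(bad))
```

In practice you would load the YAML first (e.g. with PyYAML's `yaml.safe_load`) and pass the resulting dict to the checker.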
## Benchmark-Specific Configurations

### SWE-bench (Default)

```yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
dataset: "SWE-bench/SWE-bench_Lite"  # or SWE-bench/SWE-bench_Verified
sample_size: 10
```

### CyberGym

```yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "cybergym"
dataset: "sunblaze-ucb/cybergym"
cybergym_level: 2  # 0-3
sample_size: 10
```

### MCPToolBench++

```yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "mcptoolbench"
dataset: "MCPToolBench/MCPToolBenchPP"
sample_size: 10
```
## Custom Agent Prompts

Users can customize the agent prompt using the `agent_prompt` field:

```yaml
agent_prompt: |
  Fix the following bug in this repository:

  {problem_statement}

  Make the minimal changes necessary to fix the issue.
  Focus on the root cause, not symptoms.
```

**Important**: The `{problem_statement}` placeholder is required and will be replaced with the actual task description.
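Conceptually, the substitution works like Python's `str.format` — an illustration of the placeholder mechanism, not mcpbr's actual code, and the task text below is a made-up example:

```python
prompt_template = (
    "Fix the following bug in this repository:\n\n"
    "{problem_statement}\n\n"
    "Make the minimal changes necessary to fix the issue."
)

# At runtime the harness fills in the real task description
# (hypothetical example text here).
rendered = prompt_template.format(problem_statement="TypeError in utils.py line 42")
print(rendered)
```

If `{problem_statement}` is omitted from the template, the agent never sees the task, which is why the placeholder is required.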
## Common Mistakes to Avoid

- **Missing `{workdir}`**: Forgetting to include `"{workdir}"` in `args`.
- **Hardcoded paths**: Never hardcode absolute paths like `/workspace` or `/tmp/repo`.
- **Invalid commands**: Using commands that don't exist (e.g., `uv` instead of `uvx`).
- **Wrong indentation**: YAML is whitespace-sensitive. Use 2 spaces, not tabs.
- **Missing quotes**: Environment variable references like `"${VAR}"` need quotes.
## Example Workflow

When a user asks to create a config:

1. **Ask** about their MCP server:
   - What package/command runs the server?
   - Does it need any special arguments or environment variables?
   - Is it Node.js-based (`npx`) or Python-based (`uvx`)?

2. **Generate** the config based on their answers.

3. **Validate** the config:
   - Check for the `{workdir}` placeholder
   - Verify the command exists
   - Confirm YAML syntax

4. **Save** the config (usually to `mcpbr.yaml`).

5. **Optionally test** the config with a small sample:

   ```bash
   mcpbr run -c mcpbr.yaml -n 1 -v
   ```
## Helpful Commands

```bash
# Generate a default config
mcpbr init

# List available models
mcpbr models

# List available benchmarks
mcpbr benchmarks

# Validate a config by doing a dry run with 1 task
mcpbr run -c config.yaml -n 1 -v
```