# generate-config

Generate and validate mcpbr configuration files for MCP server benchmarking.

Install this agent skill to your project:

```bash
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/mcpbr-config
```

## Instructions

You are an expert at creating valid mcpbr configuration files. Your goal is to help users create correct YAML configs for their MCP servers.
## Critical Requirements

1. **Always include the `{workdir}` placeholder**: The `args` array MUST include `"{workdir}"` as a placeholder for the task repository path. This is CRITICAL: mcpbr replaces it at runtime with the actual working directory.

2. **Valid commands**: Ensure the `command` field uses an executable that exists on the user's system:
   - `npx` for Node.js-based MCP servers
   - `uvx` for Python MCP servers via uv
   - `python` or `python3` for direct Python execution
   - Custom binaries (verify they exist with `which <command>`)

3. **Model aliases**: Use short aliases when possible:
   - `sonnet` instead of `claude-sonnet-4-5-20250929`
   - `opus` instead of `claude-opus-4-5-20251101`
   - `haiku` instead of `claude-haiku-4-5-20251001`

4. **Required fields**: Every config MUST have:
   - `mcp_server.command`
   - `mcp_server.args` (with `"{workdir}"`)
   - `provider` (usually `"anthropic"`)
   - `agent_harness` (usually `"claude-code"`)
   - `model`
   - `dataset` (or rely on the benchmark default)
## Common MCP Server Configurations

### Anthropic Filesystem Server

```yaml
mcp_server:
  name: "filesystem"
  command: "npx"
  args:
    - "-y"
    - "@modelcontextprotocol/server-filesystem"
    - "{workdir}"
  env: {}
```
### Custom Python MCP Server

```yaml
mcp_server:
  name: "my-server"
  command: "uvx"
  args:
    - "my-mcp-server"
    - "--workspace"
    - "{workdir}"
  env:
    LOG_LEVEL: "debug"
```
### Supermodel Codebase Analysis

```yaml
mcp_server:
  name: "supermodel"
  command: "npx"
  args:
    - "-y"
    - "@supermodeltools/mcp-server"
  env:
    SUPERMODEL_API_KEY: "${SUPERMODEL_API_KEY}"
```
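A reference like `${SUPERMODEL_API_KEY}` is resolved from the environment at runtime. If mcpbr follows the common `${VAR}` convention, the expansion behaves like Python's `os.path.expandvars` — a sketch of the convention only, not mcpbr's actual implementation:

```python
import os

# Set the variable (placeholder value for illustration), then expand a
# config-style reference the way ${VAR} expansion conventionally works.
os.environ["SUPERMODEL_API_KEY"] = "sk-example"
value = os.path.expandvars("${SUPERMODEL_API_KEY}")
print(value)  # prints "sk-example"

# If the variable were unset, expandvars would leave the literal
# "${SUPERMODEL_API_KEY}" string unchanged -- a useful signal that the
# user forgot to export it.
```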
## Configuration Template

When generating a new config, use this template:

```yaml
mcp_server:
  name: "<server-name>"
  command: "<executable>"
  args:
    - "<arg1>"
    - "<arg2>"
    - "{workdir}"  # CRITICAL: include this placeholder
  env: {}

provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"  # or "opus", "haiku"
dataset: "SWE-bench/SWE-bench_Lite"  # or null to use the benchmark default
sample_size: 5
timeout_seconds: 300
max_concurrent: 4
max_iterations: 30
```
## Validation Steps

Before saving a config, validate:

1. **Workdir placeholder**: Ensure `"{workdir}"` appears in the `args` array.
2. **Command exists**: Verify the command is available:

   ```bash
   which npx  # or uvx, python, etc.
   ```

3. **Syntax**: YAML syntax is correct (no tabs, proper indentation).
4. **Environment variables**: If using env vars like `${API_KEY}`, remind the user to set them.
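The structural checks above can be scripted. A minimal sketch that inspects a parsed config dict shaped like the template; `validate_config` is a hypothetical helper for illustration, not part of mcpbr:

```python
def validate_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a parsed mcpbr config (hypothetical helper)."""
    problems = []
    server = cfg.get("mcp_server", {})
    # Rule 1: the args array must carry the {workdir} placeholder.
    if "{workdir}" not in server.get("args", []):
        problems.append('mcp_server.args is missing the "{workdir}" placeholder')
    # Rule 2: top-level required fields.
    for field in ("provider", "agent_harness", "model"):
        if field not in cfg:
            problems.append(f"missing required field: {field}")
    # Rule 3: the server needs a command to launch.
    if not server.get("command"):
        problems.append("mcp_server.command is empty")
    return problems

# Example: a config missing the placeholder.
bad = {
    "mcp_server": {"name": "fs", "command": "npx", "args": ["-y", "pkg"]},
    "provider": "anthropic",
    "agent_harness": "claude-code",
    "model": "sonnet",
}
print(validate_config(bad))
```

In practice you would load the YAML first (e.g. with PyYAML's `yaml.safe_load`) and pass the resulting dict to the checker.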
## Benchmark-Specific Configurations

### SWE-bench (Default)

```yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
dataset: "SWE-bench/SWE-bench_Lite"  # or SWE-bench/SWE-bench_Verified
sample_size: 10
```

### CyberGym

```yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "cybergym"
dataset: "sunblaze-ucb/cybergym"
cybergym_level: 2  # 0-3
sample_size: 10
```

### MCPToolBench++

```yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "mcptoolbench"
dataset: "MCPToolBench/MCPToolBenchPP"
sample_size: 10
```
## Custom Agent Prompts

Users can customize the agent prompt using the `agent_prompt` field:

```yaml
agent_prompt: |
  Fix the following bug in this repository:

  {problem_statement}

  Make the minimal changes necessary to fix the issue.
  Focus on the root cause, not symptoms.
```

**Important**: The `{problem_statement}` placeholder is required and will be replaced with the actual task description.
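Conceptually, the substitution works like Python's `str.format` — an illustration of the placeholder mechanism, not mcpbr's actual code, and the task text below is a made-up example:

```python
prompt_template = (
    "Fix the following bug in this repository:\n\n"
    "{problem_statement}\n\n"
    "Make the minimal changes necessary to fix the issue."
)

# At runtime the harness fills in the real task description
# (hypothetical example text here).
rendered = prompt_template.format(problem_statement="TypeError in utils.py line 42")
print(rendered)
```

If `{problem_statement}` is omitted from the template, the agent never sees the task, which is why the placeholder is required.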
## Common Mistakes to Avoid

- **Missing `{workdir}`**: Forgetting to include `"{workdir}"` in `args`.
- **Hardcoded paths**: Never hardcode absolute paths like `/workspace` or `/tmp/repo`.
- **Invalid commands**: Using commands that don't exist (e.g., `uv` instead of `uvx`).
- **Wrong indentation**: YAML is whitespace-sensitive. Use 2 spaces, not tabs.
- **Missing quotes**: Environment variable references like `"${VAR}"` need quotes.
## Example Workflow

When a user asks to create a config:

1. **Ask** about their MCP server:
   - What package/command runs the server?
   - Does it need any special arguments or environment variables?
   - Is it Node.js-based (`npx`) or Python-based (`uvx`)?

2. **Generate** the config based on their answers.

3. **Validate** the config:
   - Check for the `{workdir}` placeholder
   - Verify the command exists
   - Confirm YAML syntax

4. **Save** the config (usually to `mcpbr.yaml`).

5. **Optionally test** the config with a small sample:

   ```bash
   mcpbr run -c mcpbr.yaml -n 1 -v
   ```
## Helpful Commands

```bash
# Generate a default config
mcpbr init

# List available models
mcpbr models

# List available benchmarks
mcpbr benchmarks

# Validate a config by doing a dry run with 1 task
mcpbr run -c config.yaml -n 1 -v
```