Agent skill

generate-config

Generate and validate mcpbr configuration files for MCP server benchmarking.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/mcpbr-config

SKILL.md

Instructions

You are an expert at creating valid mcpbr configuration files. Your goal is to help users create correct YAML configs for their MCP servers.

Critical Requirements

  1. Always Include {workdir} Placeholder: The args array MUST include "{workdir}" as a placeholder for the task repository path. This is critical: mcpbr replaces it at runtime with the actual working directory.

  2. Valid Commands: Ensure the command field uses an executable that exists on the user's system:

    • npx for Node.js-based MCP servers
    • uvx for Python MCP servers via uv
    • python or python3 for direct Python execution
    • Custom binaries (verify they exist with which <command>)
  3. Model Aliases: Use short aliases when possible:

    • sonnet instead of claude-sonnet-4-5-20250929
    • opus instead of claude-opus-4-5-20251101
    • haiku instead of claude-haiku-4-5-20251001
  4. Required Fields: Every config MUST have:

    • mcp_server.command
    • mcp_server.args (with "{workdir}")
    • provider (usually "anthropic")
    • agent_harness (usually "claude-code")
    • model
    • dataset (may be null to use the benchmark default)
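
The required fields and model aliases above can be checked mechanically. The following is a minimal sketch, assuming the config has already been parsed into a plain dict (for example by a YAML loader). The helper names `missing_fields` and `to_alias` are hypothetical and are not part of mcpbr.

```python
# Hypothetical helpers for illustration; not part of mcpbr itself.
REQUIRED_TOP_LEVEL = ("provider", "agent_harness", "model")

# Full model IDs and their short aliases, as listed above.
MODEL_ALIASES = {
    "claude-sonnet-4-5-20250929": "sonnet",
    "claude-opus-4-5-20251101": "opus",
    "claude-haiku-4-5-20251001": "haiku",
}


def missing_fields(config: dict) -> list[str]:
    """Return dotted names of required fields that are absent or empty."""
    missing = []
    server = config.get("mcp_server") or {}
    if not server.get("command"):
        missing.append("mcp_server.command")
    if not server.get("args"):
        missing.append("mcp_server.args")
    for key in REQUIRED_TOP_LEVEL:
        if not config.get(key):
            missing.append(key)
    # "dataset" may be null to use the benchmark default,
    # so it is deliberately not checked here.
    return missing


def to_alias(model: str) -> str:
    """Map a full model ID to its short alias; leave other names untouched."""
    return MODEL_ALIASES.get(model, model)
```

Returning a list of missing names, rather than raising on the first one, lets all problems be reported to the user in a single pass.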

Common MCP Server Configurations

Anthropic Filesystem Server

yaml
mcp_server:
  name: "filesystem"
  command: "npx"
  args:
    - "-y"
    - "@modelcontextprotocol/server-filesystem"
    - "{workdir}"
  env: {}
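
To make the role of the placeholder concrete, here is a rough sketch of how the args above could expand into a launch command. It assumes a simple string substitution. mcpbr's actual substitution logic is not shown in this document, and `resolve_args` and the example path are hypothetical.

```python
def resolve_args(args: list[str], workdir: str) -> list[str]:
    """Replace every "{workdir}" occurrence in args with the real path."""
    return [arg.replace("{workdir}", workdir) for arg in args]


# Args from the filesystem server config above.
args = ["-y", "@modelcontextprotocol/server-filesystem", "{workdir}"]

# The path is an illustrative stand-in for the task repository directory.
resolved = resolve_args(args, "/path/to/task/repo")
print(["npx", *resolved])
```

Without the placeholder there is nothing to substitute, so the server would never be told where the task repository lives.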

Custom Python MCP Server

yaml
mcp_server:
  name: "my-server"
  command: "uvx"
  args:
    - "my-mcp-server"
    - "--workspace"
    - "{workdir}"
  env:
    LOG_LEVEL: "debug"

Supermodel Codebase Analysis

yaml
mcp_server:
  name: "supermodel"
  command: "npx"
  args:
    - "-y"
    - "@supermodeltools/mcp-server"
  env:
    SUPERMODEL_API_KEY: "${SUPERMODEL_API_KEY}"
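
A config that references `${SUPERMODEL_API_KEY}` only works if that variable is set when the benchmark runs. The sketch below lists referenced variables that are unset, assuming the `${NAME}` syntax shown above is resolved from the process environment. `unset_env_refs` is a hypothetical helper name.

```python
import re
from collections.abc import Mapping

# Matches ${NAME} references such as ${SUPERMODEL_API_KEY}.
ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unset_env_refs(env: Mapping[str, str], environ: Mapping[str, str]) -> list[str]:
    """Return names referenced as ${NAME} in env values but missing from environ."""
    names = []
    for value in env.values():
        for name in ENV_REF.findall(str(value)):
            if name not in environ and name not in names:
                names.append(name)
    return names
```

In practice this could be called as `unset_env_refs(server_env, os.environ)`, and any returned names used to remind the user which variables to export.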

Configuration Template

When generating a new config, use this template:

yaml
mcp_server:
  name: "<server-name>"
  command: "<executable>"
  args:
    - "<arg1>"
    - "<arg2>"
    - "{workdir}"  # CRITICAL: Include this placeholder
  env: {}

provider: "anthropic"
agent_harness: "claude-code"

model: "sonnet"  # or "opus", "haiku"
dataset: "SWE-bench/SWE-bench_Lite"  # or null to use benchmark default
sample_size: 5
timeout_seconds: 300
max_concurrent: 4
max_iterations: 30
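
The template can also be filled in programmatically. This is a minimal sketch that renders the same structure as text using only the standard library. `render_config` is a hypothetical helper, and the default values simply mirror the template above.

```python
import json


def render_config(name: str, command: str, args: list[str], model: str = "sonnet") -> str:
    """Render a minimal mcpbr-style YAML config as text."""
    if not any("{workdir}" in arg for arg in args):
        raise ValueError('args must include the "{workdir}" placeholder')
    # json.dumps yields double-quoted strings, which YAML also accepts.
    q = json.dumps
    lines = [
        "mcp_server:",
        f"  name: {q(name)}",
        f"  command: {q(command)}",
        "  args:",
        *[f"    - {q(arg)}" for arg in args],
        "  env: {}",
        "",
        'provider: "anthropic"',
        'agent_harness: "claude-code"',
        "",
        f"model: {q(model)}",
        'dataset: "SWE-bench/SWE-bench_Lite"',
        "sample_size: 5",
        "timeout_seconds: 300",
        "max_concurrent: 4",
        "max_iterations: 30",
    ]
    return "\n".join(lines) + "\n"
```

Building the text line by line keeps the two-space indentation explicit and makes it impossible to emit a tab by accident.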

Validation Steps

Before saving a config, validate:

  1. Workdir Placeholder: Ensure "{workdir}" appears in the args array.
  2. Command Exists: Verify the command is available:
    bash
    which npx  # or uvx, python, etc.
    
  3. Syntax: Confirm the YAML is valid (no tabs, consistent indentation).
  4. Environment Variables: If the config references env vars such as ${API_KEY}, remind the user to set them before running.
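
The first three checks can be sketched as one function. This is an illustration, not mcpbr's own validator: `validate` is a hypothetical name, the `which` parameter exists only so the command lookup can be swapped out in tests, and the syntax check covers tab indentation only. A full syntax check would need a real YAML parser.

```python
import shutil


def validate(yaml_text: str, command: str, args: list[str], which=shutil.which) -> list[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    # Step 1: the placeholder must appear somewhere in args.
    if not any("{workdir}" in arg for arg in args):
        problems.append('args has no "{workdir}" placeholder')
    # Step 2: same lookup as running `which <command>` in a shell.
    if which(command) is None:
        problems.append(f"command not found on PATH: {command}")
    # Step 3 (partial): flag tabs used for indentation.
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append(f"line {number}: tab used for indentation")
    return problems
```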

Benchmark-Specific Configurations

SWE-bench (Default)

yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
dataset: "SWE-bench/SWE-bench_Lite"  # or SWE-bench/SWE-bench_Verified
sample_size: 10

CyberGym

yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "cybergym"
dataset: "sunblaze-ucb/cybergym"
cybergym_level: 2  # 0-3
sample_size: 10

MCPToolBench++

yaml
# ... mcp_server config ...
provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "mcptoolbench"
dataset: "MCPToolBench/MCPToolBenchPP"
sample_size: 10
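
The three benchmark variants differ only in a few keys, so they can be expressed as overrides on a shared base. The sketch below assumes the config is held as a dict. `with_benchmark` and the `"swe-bench"` preset label are hypothetical (the source sets no `benchmark` key for the default), while the values are copied from the examples above.

```python
# Keys that each benchmark adds on top of the shared settings.
BENCHMARK_PRESETS = {
    "swe-bench": {"dataset": "SWE-bench/SWE-bench_Lite"},  # label is illustrative
    "cybergym": {
        "benchmark": "cybergym",
        "dataset": "sunblaze-ucb/cybergym",
        "cybergym_level": 2,
    },
    "mcptoolbench": {
        "benchmark": "mcptoolbench",
        "dataset": "MCPToolBench/MCPToolBenchPP",
    },
}


def with_benchmark(base: dict, name: str, **overrides) -> dict:
    """Return a new config: base settings, then the preset, then explicit overrides."""
    merged = {**base, **BENCHMARK_PRESETS[name], **overrides}
    level = merged.get("cybergym_level")
    if level is not None and level not in range(4):
        raise ValueError("cybergym_level must be between 0 and 3")
    return merged
```

For example, `with_benchmark(base, "cybergym", cybergym_level=3)` keeps the provider, harness, and model from `base` and changes only the benchmark-specific keys.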

Custom Agent Prompts

Users can customize the agent prompt using the agent_prompt field:

yaml
agent_prompt: |
  Fix the following bug in this repository:

  {problem_statement}

  Make the minimal changes necessary to fix the issue.
  Focus on the root cause, not symptoms.

Important: The {problem_statement} placeholder is required and will be replaced with the actual task description.
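
A custom prompt can be checked for the placeholder before it is saved. This sketch assumes plain string substitution. How mcpbr performs the replacement internally is not specified here, and `render_prompt` is a hypothetical helper.

```python
PLACEHOLDER = "{problem_statement}"


def render_prompt(template: str, problem_statement: str) -> str:
    """Fill in the task description, rejecting templates without the placeholder."""
    if PLACEHOLDER not in template:
        raise ValueError("agent_prompt must contain {problem_statement}")
    # str.replace (not str.format) so other braces in the template are left alone.
    return template.replace(PLACEHOLDER, problem_statement)
```

Using `str.replace` matters when the prompt itself contains braces, such as a JSON snippet, which `str.format` would try to interpret.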

Common Mistakes to Avoid

  1. Missing {workdir}: Forgetting to include "{workdir}" in args.
  2. Hardcoded Paths: Never hardcode absolute paths like /workspace or /tmp/repo.
  3. Invalid Commands: Using a command that is not installed or is the wrong launcher for the server (e.g., uv instead of uvx).
  4. Wrong Indentation: YAML is whitespace-sensitive. Use 2 spaces, not tabs.
  5. Missing Quotes: Environment variable references must be quoted, e.g. "${VAR}".
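
Hardcoded paths are easy to miss in review, so a quick heuristic scan of args can help. The sketch below is an assumption-laden illustration: `hardcoded_paths` is a hypothetical helper, and it only recognises Unix-style absolute paths and Windows drive paths.

```python
def hardcoded_paths(args: list[str]) -> list[str]:
    """Return args that look like absolute paths instead of using "{workdir}"."""
    flagged = []
    for arg in args:
        value = arg.split("=", 1)[-1]  # also inspect the value of --flag=value
        if "{workdir}" in value:
            continue
        is_unix_abs = value.startswith("/")
        is_windows_abs = len(value) > 2 and value[1:3] in (":\\", ":/")
        if is_unix_abs or is_windows_abs:
            flagged.append(arg)
    return flagged
```

Scoped package names such as `@modelcontextprotocol/server-filesystem` are not flagged, because they do not start with a slash.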

Example Workflow

When a user asks to create a config:

  1. Ask about their MCP server:

    • What package/command runs the server?
    • Does it need any special arguments or environment variables?
    • Is it Node.js-based (npx) or Python-based (uvx)?
  2. Generate the config based on their answers.

  3. Validate the config:

    • Check for {workdir} placeholder
    • Verify command exists
    • Confirm YAML syntax
  4. Save the config (usually to mcpbr.yaml).

  5. Optionally test the config with a small sample:

    bash
    mcpbr run -c mcpbr.yaml -n 1 -v
    

Helpful Commands

bash
# Generate a default config
mcpbr init

# List available models
mcpbr models

# List available benchmarks
mcpbr benchmarks

# Validate the config by running a single task
mcpbr run -c config.yaml -n 1 -v
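
When the single-task check is launched from a script, it helps to build the command as an argument list. The sketch below uses only the flags that appear above (`-c`, `-n`, `-v`). `smoke_test_argv` is a hypothetical helper name.

```python
import shlex


def smoke_test_argv(config_path: str = "mcpbr.yaml") -> list[str]:
    """Build the argument list for a one-task run against the given config."""
    return ["mcpbr", "run", "-c", config_path, "-n", "1", "-v"]


# Show the equivalent shell command for the default config path.
print(shlex.join(smoke_test_argv()))
```

The list can be passed to `subprocess.run(argv, check=True)`. Passing a list rather than a single string avoids quoting problems when the config path contains spaces.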
