Agent skills
auto-router-patterns

Agent skill

auto-router-patterns

Auto router patterns for this project. Intelligent model selection via task classification, cost tier diversity, high-stakes override, weighted tier selection. Triggers on "auto router", "model selection", "classification", "cost tier", "exploration", "high stakes", "routing", "router".

View SKILL.md on GitHub Repository

Stars 163

Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/auto-router-patterns

SKILL.md

Auto Router Patterns

System intelligently routes user messages to optimal models via task classification, cost-based diversity, and high-stakes safety overrides.

Classification with gpt-oss-120b

Router uses gpt-oss-120b via Cerebras for fast classification (~1000 tokens/sec):

typescript

// From autoRouter.ts
function getRouterModel() {
  const openai = createOpenAI({
    apiKey: process.env.AI_GATEWAY_API_KEY,
    baseURL: "https://gateway.ai.cloudflare.com/v1/planetaryescape/blah-chat-dev-gateway/openai",
  });
  return openai("gpt-oss-120b"); // Via Cerebras
}

const ROUTER_MODEL_ID = "openai:gpt-oss-120b";

Classification schema with generateObject:

typescript

// From autoRouter.ts
const classificationSchema = z.object({
  primaryCategory: z.enum(TASK_CATEGORIES),
  secondaryCategory: z.enum(TASK_CATEGORIES).optional().nullable(),
  complexity: z.enum(["simple", "moderate", "complex"]),
  requiresVision: z.boolean(),
  requiresLongContext: z.boolean(),
  requiresReasoning: z.boolean(),
  confidence: z.number().min(0).max(1),
  isHighStakes: z.boolean(),
  highStakesDomain: z.enum(HIGH_STAKES_DOMAINS).optional().nullable(),
});

Task categories: coding, reasoning, creative, factual, analysis, conversation, multimodal, research.

Cost Tier Categorization

Models categorized by average pricing (input + output / 2):

typescript

// From autoRouter.ts
type CostTier = "cheap" | "mid" | "premium";

function getCostTier(pricing: { input: number; output: number }): CostTier {
  const avgCost = (pricing.input + pricing.output) / 2;
  if (avgCost < 1.0) return "cheap";
  if (avgCost < 5.0) return "mid";
  return "premium";
}

Examples:

Cheap: gpt-5-nano ($0.04/$0.16), gemini-2.0-flash ($0.1/$0.4)
Mid: gpt-5-mini ($0.15/$0.6), claude-3.5-haiku ($0.8/$4.0)
Premium: gpt-5 ($2.5/$10.0), claude-opus-4 ($15.0/$75.0)

Weighted Tier Selection by Complexity

Diversity via weighted random selection, NOT top-N then random:

typescript

// From autoRouter.ts
const TIER_WEIGHTS: Record<string, Record<CostTier, number>> = {
  simple: { cheap: 0.6, mid: 0.25, premium: 0.15 },
  moderate: { cheap: 0.5, mid: 0.3, premium: 0.2 },
  complex: { cheap: 0.3, mid: 0.4, premium: 0.3 },
};

Critical: Groups ALL models by tier, not just top N. Simple tasks get cheap models 60% of time, premium 15%.

Selection logic:

typescript

// From autoRouter.ts
function selectWithExploration(
  scoredModels: Array<{ modelId: string; score: number }>,
  classification: { complexity: string; isHighStakes?: boolean },
) {
  // Group ALL models by tier
  const tiers: Record<CostTier, Array<{ modelId: string; score: number }>> = {
    cheap: [], mid: [], premium: [],
  };

  for (const model of sorted) {
    const tier = getCostTier(MODEL_CONFIG[model.modelId].pricing);
    tiers[tier].push(model);
  }

  // Get weights for complexity
  const weights = TIER_WEIGHTS[classification.complexity] ?? TIER_WEIGHTS.simple;
  const roll = Math.random();

  // Select tier based on weighted random
  if (roll < weights.cheap && tiers.cheap.length > 0) {
    selectedTier = "cheap";
  } else if (roll < weights.cheap + weights.mid && tiers.mid.length > 0) {
    selectedTier = "mid";
    explorationPick = true;
  } else if (tiers.premium.length > 0) {
    selectedTier = "premium";
    explorationPick = true;
  }

  // Random selection within chosen tier
  const pool = tiers[selectedTier];
  return pool[Math.floor(Math.random() * pool.length)];
}

High-Stakes Override

Medical/legal/financial/safety questions force premium tier for accuracy:

typescript

// From autoRouter.ts
const HIGH_STAKES_DOMAINS = [
  "medical", "legal", "financial", "safety",
  "mental_health", "privacy", "immigration", "domestic_abuse",
] as const;

// HIGH-STAKES OVERRIDE at top of selectWithExploration
if (classification.isHighStakes) {
  if (tiers.premium.length > 0) {
    const pool = tiers.premium;
    const picked = pool[Math.floor(Math.random() * pool.length)];
    return { ...picked, explorationPick: false };
  }
  // Fallback with warning if no premium models
  logger.warn("High-stakes query but no premium models available");
  return { ...sorted[0], explorationPick: false };
}

Classification prompt emphasizes advice vs information:

typescript

// From routerPrompts.ts
RULES:
1. Must seek ADVICE or ACTION, not just information
2. "What is a heart attack?" = NOT high stakes (educational)
3. "Am I having a heart attack?" = HIGH STAKES (medical)
4. "What does liability mean?" = NOT high stakes (definition)
5. "Can my employer fire me for this?" = HIGH STAKES (legal)

Diversity vs Top-Score Trade-off

System balances model quality with cost/speed diversity:

Scoring phase (autoRouter.ts):

Base: category match score (0-100 from MODEL_PROFILES)
Secondary category bonus: +30% of secondary score
Complexity: simple tasks penalized 0.7x (prefer cheap), complex boosted 1.2x (prefer capable)
Cost bias: -(avgCost / 30) * (costBias / 100) * 20
Speed bias: +speedBonus * (speedBias / 100)
Stickiness: +25 if model already selected in conversation
Reasoning bonus: +15 if task requires thinking and model has it
Research bonus: +25 for Perplexity models on research tasks

Selection phase (selectWithExploration):

NOT greedy (always top score)
NOT pure random (chaos)
Weighted probabilistic by cost tier AND complexity
Ensures variety across conversations without sacrificing appropriateness

Excluded Models Tracking for Retries

Failed models excluded from retry attempts:

typescript

// From autoRouter.ts routeMessage args
export const routeMessage = internalAction({
  args: {
    // ...
    excludedModels: v.optional(v.array(v.string())), // Failed models
  },
  handler: async (ctx, args) => {
    // Filter eligible models
    const eligibleModels = getEligibleModels(
      classification,
      args.currentContextTokens ?? 0,
      args.excludedModels, // ← Passed to filter
    );

    // Check if all models exhausted
    if (eligibleModels.length === 0) {
      const fallbackModel = "openai:gpt-5-mini";
      if (args.excludedModels?.includes(fallbackModel)) {
        throw new Error("All models exhausted including fallback");
      }
      return { selectedModelId: fallbackModel, /* ... */ };
    }
  },
});

function getEligibleModels(
  classification: TaskClassification,
  currentContextTokens: number,
  excludedModels?: string[],
): string[] {
  return Object.keys(MODEL_CONFIG).filter((modelId) => {
    // Exclude failed models from retry attempts
    if (excludedModels?.includes(modelId)) return false;
    // ... other filters
  });
}

Caller (chat.ts or generation retry logic) tracks failed models and passes them to router.

Key Files

packages/backend/convex/ai/autoRouter.ts - Main routing action, classification, selection
packages/backend/convex/ai/modelProfiles.ts - MODEL_CONFIG, MODEL_PROFILES, category scores
packages/backend/convex/ai/routerPrompts.ts - Classification prompt, reasoning template
packages/backend/convex/chat.ts - Calls routeMessage when user has "auto" selected
packages/backend/convex/generation.ts - Retry logic with excludedModels

Avoid

Don't use top-N greedy selection - breaks diversity
Don't skip high-stakes override - safety critical
Don't hardcode tier weights - use complexity-based config
Don't forget to track router usage for cost monitoring (recordTextGeneration)
Don't use classification model for generation - gpt-oss-120b is internal-only

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/auto-router-patterns
License: MIT License

Featured Tools

Join Our Newsletter

Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.

163 31

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

Auto Router Patterns

Classification with gpt-oss-120b

Cost Tier Categorization

Weighted Tier Selection by Complexity

High-Stakes Override

Diversity vs Top-Score Trade-off

Excluded Models Tracking for Retries

Key Files

Avoid

Recommended Agent Skills

agent-ops-spec

agent-ops-state

agent-ops-spec

agent-ops-testing

agent-ops-testing

agent-ops-state