Agent skill

perplexity-cost-tuning

Optimize Perplexity costs through model routing, caching, token limits, and budget monitoring. Use when analyzing Perplexity billing, reducing API costs, or implementing budget alerts for Perplexity Sonar API. Trigger with phrases like "perplexity cost", "perplexity billing", "reduce perplexity costs", "perplexity pricing", "perplexity budget".

Install this agent skill to your Project

npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/perplexity-pack/skills/perplexity-cost-tuning

SKILL.md

Perplexity Cost Tuning

Overview

Reduce Perplexity Sonar API costs. Perplexity charges per token (input + output) plus a per-request fee that varies by search context size. The biggest cost lever is model selection: at the prices below, sonar-pro costs 3x more than sonar per input token and 15x more per output token.

Pricing Reference

Model	Input $/M tokens	Output $/M tokens	Request Fee
`sonar`	$1	$1	$5 per 1K requests
`sonar-pro`	$3	$15	$5 per 1K requests
`sonar-reasoning-pro`	$3	$15	$5 per 1K requests
`sonar-deep-research`	$2	$8	$5 per 1K searches

The request fee depends on the search context size (Low, Medium, or High): the more search context a request pulls in, the higher its fee.
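To make the table concrete, here is a minimal sketch of a per-request cost estimate. It assumes the prices listed above and, as a simplification, treats the request fee as the flat $5 per 1,000 requests shown in the table. `PRICING` and `estimateRequestCost` are illustrative names, not part of the Perplexity API.

```typescript
// Prices copied from the Pricing Reference table (USD).
// Assumption: the request fee is the flat figure from the table; the real
// fee varies with search context size, and sonar-deep-research is billed
// per search rather than per request.
type Model = "sonar" | "sonar-pro" | "sonar-reasoning-pro" | "sonar-deep-research";

const PRICING: Record<Model, { inputPerM: number; outputPerM: number; feePer1K: number }> = {
  "sonar": { inputPerM: 1, outputPerM: 1, feePer1K: 5 },
  "sonar-pro": { inputPerM: 3, outputPerM: 15, feePer1K: 5 },
  "sonar-reasoning-pro": { inputPerM: 3, outputPerM: 15, feePer1K: 5 },
  "sonar-deep-research": { inputPerM: 2, outputPerM: 8, feePer1K: 5 },
};

// Estimated cost in USD of a single request: token charges plus request fee.
function estimateRequestCost(model: Model, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  const tokenCost = (inputTokens * p.inputPerM + outputTokens * p.outputPerM) / 1_000_000;
  const requestFee = p.feePer1K / 1000;
  return tokenCost + requestFee;
}

// Same 50-token prompt and 500-token answer on both models
const sonarCost = estimateRequestCost("sonar", 50, 500);
const proCost = estimateRequestCost("sonar-pro", 50, 500);
console.log(sonarCost.toFixed(5), proCost.toFixed(5)); // 0.00555 0.01265
```

Under these assumptions the request fee dominates the cost of short sonar answers, while output tokens dominate sonar-pro answers.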

Prerequisites

Perplexity API account with usage dashboard
Understanding of query patterns in your application
Cache infrastructure for search results

Instructions

Step 1: Route Queries to the Right Model

typescript

// An estimated 60-70% of queries are simple lookups that sonar handles well;
// sonar-pro tokens cost 3x (input) to 15x (output) more
function selectModel(query: string): "sonar" | "sonar-pro" {
  const simplePatterns = [
    /^what is/i, /^define/i, /^who is/i, /^when did/i,
    /current price/i, /^how many/i, /^is it true/i,
  ];
  if (simplePatterns.some((p) => p.test(query))) return "sonar";

  const complexPatterns = [
    /compare.*vs/i, /analysis of/i, /comprehensive/i,
    /pros and cons/i, /in-depth/i, /research/i,
  ];
  if (complexPatterns.some((p) => p.test(query))) return "sonar-pro";

  return "sonar"; // Default to cheapest
}
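How much routing saves depends on the traffic mix and on answer length. The sketch below works through that arithmetic using the token prices from the Pricing Reference table. The workload figures (100 input and 400 output tokens per query, 65% simple queries) are hypothetical, and `blendedTokenCost` is an illustrative helper name.

```typescript
// Token prices from the Pricing Reference table (USD per million tokens).
const PRICE = {
  "sonar": { input: 1, output: 1 },
  "sonar-pro": { input: 3, output: 15 },
} as const;

// Token cost of one query. Request fees are left out because the table
// lists the same fee for both models.
function tokenCost(model: keyof typeof PRICE, inputTokens: number, outputTokens: number): number {
  const p = PRICE[model];
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// Average token cost per query when `simpleShare` of traffic goes to sonar
// and the rest goes to sonar-pro.
function blendedTokenCost(simpleShare: number, inputTokens: number, outputTokens: number): number {
  return (
    simpleShare * tokenCost("sonar", inputTokens, outputTokens) +
    (1 - simpleShare) * tokenCost("sonar-pro", inputTokens, outputTokens)
  );
}

// Hypothetical workload: 100 input and 400 output tokens per query
const allPro = blendedTokenCost(0, 100, 400);
const routed = blendedTokenCost(0.65, 100, 400);
const savings = 1 - routed / allPro;
console.log(`token cost savings vs. all sonar-pro: ${(savings * 100).toFixed(1)}%`);
```

Plugging in your own token counts and simple-query share gives a quick check on whether the router is worth tuning further.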

Step 2: Limit Output Tokens

bash

set -euo pipefail
# Factual queries need ~100 tokens, not 4096
# Setting max_tokens dramatically reduces output costs

# Simple fact: 100 tokens = $0.0001 output
curl -X POST https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar",
    "messages": [{"role": "user", "content": "Current population of Tokyo"}],
    "max_tokens": 100
  }'

# Research query: keep at 2048 only when needed
curl -X POST https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar-pro",
    "messages": [{"role": "user", "content": "Compare React vs Vue in 2025 for enterprise apps"}],
    "max_tokens": 2048
  }'
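The same cap can be applied in application code so that no request goes out without `max_tokens`. This is a minimal sketch: `TOKEN_BUDGETS` and `buildRequest` are hypothetical names, and the budgets simply mirror the 100 and 2048 values from the curl examples.

```typescript
type SonarModel = "sonar" | "sonar-pro";

// Default caps mirroring the curl examples: short facts vs. research answers.
const TOKEN_BUDGETS: Record<SonarModel, number> = {
  "sonar": 100,
  "sonar-pro": 2048,
};

// Output prices from the Pricing Reference table (USD per million tokens).
const OUTPUT_PRICE_PER_M: Record<SonarModel, number> = {
  "sonar": 1,
  "sonar-pro": 15,
};

interface ChatRequest {
  model: SonarModel;
  messages: Array<{ role: "user"; content: string }>;
  max_tokens: number;
}

// Build a request body that always carries a max_tokens cap.
// An explicit maxTokens overrides the per-model default.
function buildRequest(model: SonarModel, query: string, maxTokens?: number): ChatRequest {
  return {
    model,
    messages: [{ role: "user", content: query }],
    max_tokens: maxTokens ?? TOKEN_BUDGETS[model],
  };
}

// Worst-case output cost in USD if the model uses the whole cap.
function maxOutputCost(req: ChatRequest): number {
  return (req.max_tokens * OUTPUT_PRICE_PER_M[req.model]) / 1_000_000;
}

const fact = buildRequest("sonar", "Current population of Tokyo");
const research = buildRequest("sonar-pro", "Compare React vs Vue in 2025 for enterprise apps");
console.log(maxOutputCost(fact), maxOutputCost(research));
```

The returned object can be serialized with `JSON.stringify` and sent as the body of the same `chat/completions` call shown in the curl examples.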

Step 3: Cache to Eliminate Duplicate Queries

typescript

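import { LRUCache } from "lru-cache";
import { createHash } from "node:crypto";
import OpenAI from "openai";

// Assumption: the OpenAI SDK pointed at Perplexity's OpenAI-compatible endpoint
const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

const searchCache = new LRUCache<string, OpenAI.Chat.Completions.ChatCompletion>({
  max: 10000,
  ttl: 4 * 3600_000, // 4-hour default TTL
});

// lru-cache does not expose hit/miss counters, so count them here
let hits = 0;
let misses = 0;

async function cachedQuery(query: string, model: string) {
  // Key on model plus a lightly normalized query
  const key = createHash("sha256")
    .update(`${model}:${query.toLowerCase().trim()}`)
    .digest("hex");

  const cached = searchCache.get(key);
  if (cached) {
    hits++;
    return cached; // $0 cost
  }
  misses++;

  const result = await perplexity.chat.completions.create({
    model,
    messages: [{ role: "user", content: query }],
  });
  searchCache.set(key, result);
  return result;
}

// Track cache effectiveness
function cacheStats() {
  const total = hits + misses;
  return {
    size: searchCache.size,
    hitRate: total === 0 ? "n/a" : `${((hits / total) * 100).toFixed(1)}%`,
  };
}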
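Cache hit rate depends heavily on how queries are normalized before hashing. The sketch below shows one possible normalization pass. `normalizeQuery` and `cacheKey` are illustrative helpers, and the rules (lowercasing, whitespace collapsing, trailing punctuation removal) are assumptions that should be checked against your own query patterns.

```typescript
import { createHash } from "node:crypto";

// Canonicalize a query so trivially different phrasings share one cache key.
function normalizeQuery(query: string): string {
  return query
    .toLowerCase()
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim()
    .replace(/[?!.]+$/, "") // drop trailing sentence punctuation
    .trim();
}

// Cache key: the model is part of the key so answers from different
// models never collide.
function cacheKey(model: string, query: string): string {
  return createHash("sha256")
    .update(`${model}:${normalizeQuery(query)}`)
    .digest("hex");
}

const a = cacheKey("sonar", "What is  the capital of France?");
const b = cacheKey("sonar", "  what is the capital of france ");
console.log(a === b); // true
```

More aggressive normalization (stop-word removal, stemming) raises the hit rate further but risks serving an answer to a subtly different question, so it is safer to start conservative.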

Step 4: Use Domain Filters to Reduce Search Cost

bash

set -euo pipefail
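# Restricting search domains narrows the search to sources you trust.
# Per the Pricing Reference, the request fee is tied to search context size,
# so measure the cost effect of filtering rather than assuming it.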
curl -X POST https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar",
    "messages": [{"role": "user", "content": "Python 3.13 release notes"}],
    "search_domain_filter": ["python.org", "docs.python.org"],
    "max_tokens": 500
  }'
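Because the request fee tracks search context size, a domain filter is often paired with a smaller context setting. The sketch below builds such a request body. It assumes the context size is set through a `web_search_options.search_context_size` field, which should be verified against the current API reference, and `buildFilteredRequest` is a hypothetical helper.

```typescript
type SearchContextSize = "low" | "medium" | "high";

interface SonarRequest {
  model: string;
  messages: Array<{ role: "user"; content: string }>;
  max_tokens: number;
  search_domain_filter?: string[];
  // Assumption: field name and values as described in the lead-in
  web_search_options?: { search_context_size: SearchContextSize };
}

// Build a sonar request restricted to the given domains, defaulting to the
// smallest search context.
function buildFilteredRequest(
  query: string,
  domains: string[],
  contextSize: SearchContextSize = "low"
): SonarRequest {
  // Tidy and de-duplicate the domain list
  const cleaned = domains
    .map((d) => d.trim().toLowerCase())
    .filter((d, i, all) => d.length > 0 && all.indexOf(d) === i);

  const request: SonarRequest = {
    model: "sonar",
    messages: [{ role: "user", content: query }],
    max_tokens: 500,
    web_search_options: { search_context_size: contextSize },
  };
  // Only send the filter when at least one domain survives cleaning
  if (cleaned.length > 0) request.search_domain_filter = cleaned;
  return request;
}

const body = buildFilteredRequest("Python 3.13 release notes", [
  "python.org",
  "Docs.Python.org ",
  "python.org",
]);
console.log(JSON.stringify(body, null, 2));
```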

Step 5: Track and Budget

typescript

class CostTracker {
  private costs: Array<{ model: string; tokens: number; timestamp: Date }> = [];

  record(model: string, usage: { total_tokens: number }) {
    this.costs.push({
      model,
      tokens: usage.total_tokens,
      timestamp: new Date(),
    });
  }

  dailySummary() {
    const today = this.costs.filter(
      (c) => c.timestamp.toDateString() === new Date().toDateString()
    );
    const sonarTokens = today.filter((c) => c.model === "sonar").reduce((s, c) => s + c.tokens, 0);
    const proTokens = today.filter((c) => c.model === "sonar-pro").reduce((s, c) => s + c.tokens, 0);

    return {
      queries: today.length,
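      // Rough estimate: $1/M tokens for sonar, ~$9/M blended (input $3, output $15)
      // for sonar-pro; request fees are not included
      estimatedCost: (sonarTokens * 0.000001) + (proTokens * 0.000009),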
      sonarQueries: today.filter((c) => c.model === "sonar").length,
      proQueries: today.filter((c) => c.model === "sonar-pro").length,
    };
  }
}
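A tracker becomes a budget alert once its running total is compared against a cap. The sketch below is one way to do that, assuming spend is available as a month-to-date total in USD. `checkBudget` and the 80% warning threshold are illustrative choices, not something the Perplexity API provides.

```typescript
interface BudgetStatus {
  spent: number;
  projected: number; // month-end spend if the current daily rate continues
  level: "ok" | "warning" | "exceeded";
}

// Compare month-to-date spend against a monthly budget.
// dayOfMonth must be at least 1.
function checkBudget(
  spentSoFar: number,
  monthlyBudget: number,
  dayOfMonth: number,
  daysInMonth: number,
  warnAt = 0.8 // warn once this fraction of the budget is used
): BudgetStatus {
  const projected = (spentSoFar / dayOfMonth) * daysInMonth;

  let level: BudgetStatus["level"] = "ok";
  if (spentSoFar >= monthlyBudget) {
    level = "exceeded";
  } else if (spentSoFar >= warnAt * monthlyBudget || projected > monthlyBudget) {
    level = "warning";
  }
  return { spent: spentSoFar, projected, level };
}

console.log(checkBudget(12, 50, 10, 30).level); // ok
console.log(checkBudget(25, 50, 10, 30).level); // warning
console.log(checkBudget(55, 50, 20, 30).level); // exceeded
```

Projecting the month-end total catches a runaway workload early in the month, before the absolute threshold is reached.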

Cost Optimization Checklist

Default model is sonar (not sonar-pro)
max_tokens set on every request
Caching enabled for repeated queries
Model routing by query complexity
Domain filter used where applicable
Monthly budget cap set on API key
Cost tracking in production monitoring

Error Handling

Issue	Cause	Solution
High cost per query	Using sonar-pro for everything	Route simple queries to sonar
Low cache hit rate	Queries too unique	Normalize queries before hashing
Budget exhausted early	No spending caps	Set monthly budget on API key
Unexpectedly high bill	No max_tokens limits	Set max_tokens on all requests

Output

Model routing that sends simple queries (an estimated 60-70% of traffic) to sonar
Token limiting reducing output costs
Caching eliminating duplicate query costs
Cost tracking for budget monitoring

Next Steps

For architecture patterns, see perplexity-reference-architecture.

Maintainer

jeremylongshore Core maintainer

Source details

Full Name: jeremylongshore/claude-code-plugins-plus-skills
Branch: main
Path in repo: plugins/saas-packs/perplexity-pack/skills/perplexity-cost-tuning
License: Other
Topics: ai claude-code anthropic agent-skills automation mcp ai-agents developer-tools skills llm marketplace saas claude-code-plugins devops plugin-marketplace plugin-system
