Agent skills
perplexity-data-handling

Agent skill

perplexity-data-handling

Implement Perplexity query sanitization, citation validation, result caching, and conversation context management for search workflows. Trigger with phrases like "perplexity data", "perplexity PII", "perplexity citations", "perplexity cache", "perplexity context".

View SKILL.md on GitHub Repository

Stars 1,803

Forks 241

Install this agent skill to your Project

npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/perplexity-pack/skills/perplexity-data-handling

SKILL.md

Perplexity Data Handling

Overview

Manage data flowing through Perplexity Sonar API. Critical concern: queries are sent to Perplexity for web search, so any PII in queries is exposed to external infrastructure. Responses contain citations (third-party URLs) that must be validated before displaying to users.

Data Flow

User Input → Query Sanitization → Perplexity API → Response Parsing
                                                         │
                                           ┌─────────────┼──────────────┐
                                           │             │              │
                                      Answer Text    Citations    Search Results
                                           │             │              │
                                      Format &      Validate &    Store for
                                      Display       Deduplicate   Analytics

Prerequisites

Perplexity API key configured
Understanding of PII regulations (GDPR/CCPA)
Cache storage (Redis or in-memory)

Instructions

Step 1: Query Sanitization

typescript

function sanitizeQuery(query: string): { clean: string; redacted: boolean } {
  let clean = query;
  let redacted = false;

  const patterns: Array<[RegExp, string]> = [
    [/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[email]"],
    [/\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g, "[phone]"],
    [/\b\d{3}-\d{2}-\d{4}\b/g, "[ssn]"],
    [/\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g, "[card]"],
    [/\b(pplx-|sk-|pk_|sk_live_)\w{20,}\b/g, "[token]"],
    [/\b(user|customer|account)\s*#?\s*\d+\b/gi, "[id]"],
  ];

  for (const [pattern, replacement] of patterns) {
    if (pattern.test(clean)) {
      clean = clean.replace(pattern, replacement);
      redacted = true;
    }
  }

  return { clean, redacted };
}

async function safeSearch(rawQuery: string) {
  const { clean, redacted } = sanitizeQuery(rawQuery);
  if (redacted) {
    console.warn("[Data] PII redacted from Perplexity query");
  }

  return perplexity.chat.completions.create({
    model: "sonar",
    messages: [{ role: "user", content: clean }],
  });
}

Step 2: Citation Validation

typescript

interface ValidatedCitation {
  url: string;
  domain: string;
  valid: boolean;
  index: number;
}

function validateCitations(citations: string[]): ValidatedCitation[] {
  return citations.map((url, i) => {
    try {
      const parsed = new URL(url);
      return {
        url: url.replace(/[.,;:]+$/, ""),
        domain: parsed.hostname,
        valid: ["http:", "https:"].includes(parsed.protocol),
        index: i + 1,
      };
    } catch {
      return { url, domain: "unknown", valid: false, index: i + 1 };
    }
  });
}

function deduplicateCitations(citations: ValidatedCitation[]): ValidatedCitation[] {
  const seen = new Set<string>();
  return citations.filter((c) => {
    const normalized = c.url.split("?")[0].replace(/\/$/, "");
    if (seen.has(normalized)) return false;
    seen.add(normalized);
    return true;
  });
}

// Replace [1] markers with linked citations
function renderCitations(answer: string, citations: ValidatedCitation[]): string {
  let rendered = answer;
  for (const c of citations.filter((c) => c.valid)) {
    rendered = rendered.replaceAll(`[${c.index}]`, `[${c.index}](${c.url})`);
  }
  return rendered;
}

Step 3: Result Caching with Freshness Policy

typescript

import { LRUCache } from "lru-cache";
import { createHash } from "crypto";

interface CachedResult {
  answer: string;
  citations: ValidatedCitation[];
  cachedAt: number;
  model: string;
}

const CACHE_TTL: Record<string, number> = {
  news: 30 * 60_000,       // 30 min for breaking/current events
  research: 4 * 3600_000,  // 4 hours for research topics
  factual: 24 * 3600_000,  // 24 hours for stable facts
  default: 1 * 3600_000,   // 1 hour default
};

const resultCache = new LRUCache<string, CachedResult>({ max: 500 });

function detectQueryType(query: string): keyof typeof CACHE_TTL {
  if (/\b(latest|today|breaking|recent|this week)\b/i.test(query)) return "news";
  if (/\b(research|study|paper|analysis|compare)\b/i.test(query)) return "research";
  if (/\b(what is|define|how does|who is)\b/i.test(query)) return "factual";
  return "default";
}

async function cachedSearch(query: string, model = "sonar") {
  const hash = createHash("sha256")
    .update(`${model}:${query.toLowerCase().trim()}`)
    .digest("hex");

  const cached = resultCache.get(hash);
  if (cached) return { ...cached, fromCache: true };

  const response = await safeSearch(query);
  const rawCitations = (response as any).citations || [];
  const citations = deduplicateCitations(validateCitations(rawCitations));
  const queryType = detectQueryType(query);

  const entry: CachedResult = {
    answer: response.choices[0].message.content || "",
    citations,
    cachedAt: Date.now(),
    model: response.model,
  };

  resultCache.set(hash, entry, { ttl: CACHE_TTL[queryType] });
  return { ...entry, fromCache: false };
}

Step 4: Conversation Context Management

typescript

import OpenAI from "openai";

type Message = OpenAI.ChatCompletionMessageParam;

class SearchContext {
  private messages: Message[] = [];
  private readonly maxMessages = 10;
  private readonly maxEstimatedTokens = 8000;

  constructor(systemPrompt?: string) {
    if (systemPrompt) {
      this.messages.push({ role: "system", content: systemPrompt });
    }
  }

  addUserMessage(content: string) {
    this.messages.push({ role: "user", content });
    this.trim();
  }

  addAssistantMessage(content: string) {
    this.messages.push({ role: "assistant", content });
    this.trim();
  }

  getMessages(): Message[] {
    return [...this.messages];
  }

  private trim() {
    // Keep system prompt + last N messages
    while (this.messages.length > this.maxMessages) {
      const systemIdx = this.messages[0].role === "system" ? 1 : 0;
      this.messages.splice(systemIdx, 1);
    }

    // Trim if estimated tokens too high
    while (this.estimateTokens() > this.maxEstimatedTokens && this.messages.length > 2) {
      const systemIdx = this.messages[0].role === "system" ? 1 : 0;
      this.messages.splice(systemIdx, 1);
    }
  }

  private estimateTokens(): number {
    return this.messages.reduce(
      (sum, m) => sum + Math.ceil(String(m.content).length / 4),
      0
    );
  }

  clear() {
    const system = this.messages.find((m) => m.role === "system");
    this.messages = system ? [system] : [];
  }
}

Error Handling

Issue	Cause	Solution
PII in search query	User entered personal data	Apply `sanitizeQuery` before API call
Broken citation URLs	Source page moved/deleted	Validate URLs, filter invalid ones
Stale cached results	TTL too long for news	Use query-type-aware TTL
Context overflow	Too many conversation turns	Automatic trimming in SearchContext
Duplicate citations	Same source cited multiple times	Deduplicate by normalized URL

Output

Query sanitization stripping PII before API calls
Citation validation and deduplication
Cache with query-type-aware TTL
Conversation context with automatic trimming

Resources

Next Steps

For access control, see perplexity-enterprise-rbac.

Maintainer

jeremylongshore Core maintainer

Source details

Full Name: jeremylongshore/claude-code-plugins-plus-skills
Branch: main
Path in repo: plugins/saas-packs/perplexity-pack/skills/perplexity-data-handling
License: Other
Topics: ai claude-code anthropic agent-skills automation mcp ai-agents developer-tools skills llm marketplace saas claude-code-plugins devops plugin-marketplace plugin-system

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

jeremylongshore/claude-code-plugins-plus-skills

dockerfile-generator

Dockerfile Generator - Auto-activating skill for DevOps Basics. Triggers on: dockerfile generator, dockerfile generator Part of the DevOps Basics skill category.

1,803 241

Explore

jeremylongshore/claude-code-plugins-plus-skills

branch-naming-helper

Branch Naming Helper - Auto-activating skill for DevOps Basics. Triggers on: branch naming helper, branch naming helper Part of the DevOps Basics skill category.

1,803 241

Explore

jeremylongshore/claude-code-plugins-plus-skills

readme-generator

Readme Generator - Auto-activating skill for DevOps Basics. Triggers on: readme generator, readme generator Part of the DevOps Basics skill category.

1,803 241

Explore

jeremylongshore/claude-code-plugins-plus-skills

makefile-generator

Makefile Generator - Auto-activating skill for DevOps Basics. Triggers on: makefile generator, makefile generator Part of the DevOps Basics skill category.

1,803 241

Explore

jeremylongshore/claude-code-plugins-plus-skills

gitignore-generator

Gitignore Generator - Auto-activating skill for DevOps Basics. Triggers on: gitignore generator, gitignore generator Part of the DevOps Basics skill category.

1,803 241

Explore

jeremylongshore/claude-code-plugins-plus-skills

pre-commit-hook-setup

Pre Commit Hook Setup - Auto-activating skill for DevOps Basics. Triggers on: pre commit hook setup, pre commit hook setup Part of the DevOps Basics skill category.

1,803 241

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

Perplexity Data Handling

Overview

Data Flow

Prerequisites

Instructions

Step 1: Query Sanitization

Step 2: Citation Validation

Step 3: Result Caching with Freshness Policy

Step 4: Conversation Context Management

Error Handling

Output

Resources

Next Steps

Recommended Agent Skills

dockerfile-generator

branch-naming-helper

readme-generator

makefile-generator

gitignore-generator

pre-commit-hook-setup