llm-security

Security guidelines for LLM applications based on OWASP Top 10 for LLM 2025. Use when building LLM apps, reviewing AI security, implementing RAG systems, or asking about LLM vulnerabilities like 'prompt injection' or 'check LLM security'. IMPORTANT: Always consult this skill when building chatbots, AI agents, RAG pipelines, tool-using LLMs, agentic systems, or any application that calls an LLM API (OpenAI, Anthropic, Gemini, etc.) — even if the user doesn't explicitly mention security. Also use when users import 'openai', 'anthropic', 'langchain', 'llamaindex', or similar LLM libraries.

Install this agent skill to your Project

npx add-skill https://github.com/semgrep/skills/tree/main/skills/llm-security

SKILL.md

LLM Security Guidelines (OWASP Top 10 for LLM 2025)

Security rules for building secure LLM applications, based on the OWASP Top 10 for LLM Applications 2025.

How to Use This Skill

Proactive mode — When building or reviewing LLM applications, automatically check for relevant security risks based on the application pattern. You don't need to wait for the user to ask about LLM security.

Reactive mode — When the user asks about LLM security, use the mapping below to find relevant rule files with detailed vulnerable/secure code examples.

Workflow

1. Identify what the user is building (see "What Are You Building?" below)
2. Check the priority rules for that pattern
3. Read the specific rule files from rules/ for code examples
4. Apply the secure patterns or flag vulnerable ones

What Are You Building?

Use this to quickly identify which rules matter most for the user's task:

Building...	Priority Rules
Chatbot / conversational AI	Prompt Injection (LLM01), System Prompt Leakage (LLM07), Output Handling (LLM05), Unbounded Consumption (LLM10)
RAG system	Vector/Embedding Weaknesses (LLM08), Prompt Injection (LLM01), Sensitive Disclosure (LLM02), Misinformation (LLM09)
AI agent with tools	Excessive Agency (LLM06), Prompt Injection (LLM01), Output Handling (LLM05), Sensitive Disclosure (LLM02)
Fine-tuning / training	Data Poisoning (LLM04), Supply Chain (LLM03), Sensitive Disclosure (LLM02)
LLM-powered API	Unbounded Consumption (LLM10), Prompt Injection (LLM01), Output Handling (LLM05), Sensitive Disclosure (LLM02)
Content generation	Misinformation (LLM09), Output Handling (LLM05), Prompt Injection (LLM01)
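
The table above can be read as a simple lookup from application pattern to rule files. The following is a minimal sketch, not part of the skill itself: the dictionary keys and the `rule_files_for` helper are hypothetical names chosen for illustration, while the LLM identifiers and file paths are the ones listed in the table and in the Categories section.

```python
# Hypothetical lookup transcribed from the "What Are You Building?" table.
# The short keys ("chatbot", "rag", ...) are illustrative labels only.
PRIORITY_RULES = {
    "chatbot": ["LLM01", "LLM07", "LLM05", "LLM10"],
    "rag": ["LLM08", "LLM01", "LLM02", "LLM09"],
    "agent": ["LLM06", "LLM01", "LLM05", "LLM02"],
    "fine-tuning": ["LLM04", "LLM03", "LLM02"],
    "api": ["LLM10", "LLM01", "LLM05", "LLM02"],
    "content-generation": ["LLM09", "LLM05", "LLM01"],
}

# Rule file paths as listed under "Categories".
RULE_FILES = {
    "LLM01": "rules/prompt-injection.md",
    "LLM02": "rules/sensitive-disclosure.md",
    "LLM03": "rules/supply-chain.md",
    "LLM04": "rules/data-poisoning.md",
    "LLM05": "rules/output-handling.md",
    "LLM06": "rules/excessive-agency.md",
    "LLM07": "rules/system-prompt-leakage.md",
    "LLM08": "rules/vector-embedding.md",
    "LLM09": "rules/misinformation.md",
    "LLM10": "rules/unbounded-consumption.md",
}


def rule_files_for(pattern: str) -> list[str]:
    """Return the rule files to read for a pattern, in priority order."""
    return [RULE_FILES[rule_id] for rule_id in PRIORITY_RULES[pattern]]
```

For example, `rule_files_for("rag")` starts with `rules/vector-embedding.md`, matching the first priority rule in the RAG row.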

Categories

Critical Impact

LLM01: Prompt Injection (rules/prompt-injection.md) - Prevent direct and indirect prompt manipulation
LLM02: Sensitive Information Disclosure (rules/sensitive-disclosure.md) - Protect PII, credentials, and proprietary data
LLM03: Supply Chain (rules/supply-chain.md) - Secure model sources, training data, and dependencies
LLM04: Data and Model Poisoning (rules/data-poisoning.md) - Prevent training data manipulation and backdoors
LLM05: Improper Output Handling (rules/output-handling.md) - Sanitize LLM outputs before downstream use
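
As one illustration of LLM05, the sketch below treats model output as untrusted input in two downstream sinks: HTML rendering and a SQL query. It is an assumption-laden example rather than the content of rules/output-handling.md; the function names and the `users` table are hypothetical, and only standard-library calls (`html.escape`, `sqlite3` placeholders) are used.

```python
import html
import sqlite3


def render_reply(llm_text: str) -> str:
    """Escape model output before embedding it in an HTML page."""
    return "<p>" + html.escape(llm_text) + "</p>"


def find_user(conn: sqlite3.Connection, llm_extracted_name: str) -> list[tuple]:
    """Look up a user by a name the model extracted from a conversation.

    The value is bound through a placeholder, so it is never
    interpreted as part of the SQL statement itself.
    """
    cursor = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",  # hypothetical schema
        (llm_extracted_name,),
    )
    return cursor.fetchall()
```

With this shape, a model reply containing `<script>` is rendered as inert text, and an extracted "name" such as `x' OR '1'='1` is simply a name that matches no row.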

High Impact

LLM06: Excessive Agency (rules/excessive-agency.md) - Limit LLM permissions, functionality, and autonomy
LLM07: System Prompt Leakage (rules/system-prompt-leakage.md) - Protect system prompts from disclosure
LLM08: Vector and Embedding Weaknesses (rules/vector-embedding.md) - Secure RAG systems and embeddings
LLM09: Misinformation (rules/misinformation.md) - Mitigate hallucinations and false outputs
LLM10: Unbounded Consumption (rules/unbounded-consumption.md) - Prevent DoS, cost attacks, and model theft
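
For LLM10, one common way to bound consumption is a per-client token bucket combined with an input size cap. The sketch below assumes a single-process service; the class, the `admit` helper, and the `MAX_PROMPT_CHARS` value are hypothetical and would need tuning (and shared state) in a real deployment.

```python
import time

MAX_PROMPT_CHARS = 8000  # hypothetical cap; choose one that fits your model


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock  # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.last
        # Refill based on elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def admit(bucket: TokenBucket, prompt: str) -> bool:
    """Reject oversized prompts first, then apply the rate limit."""
    return len(prompt) <= MAX_PROMPT_CHARS and bucket.allow()
```

Checking the length before calling `allow()` means an oversized request is rejected without spending any of the client's budget.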

See rules/_sections.md for the full index with OWASP/MITRE references.

Quick Reference

Vulnerability	Key Prevention
Prompt Injection	Input validation, output filtering, privilege separation
Sensitive Disclosure	Data sanitization, access controls, encryption
Supply Chain	Verify models, SBOM, trusted sources only
Data Poisoning	Data validation, anomaly detection, sandboxing
Output Handling	Treat LLM as untrusted, encode outputs, parameterize queries
Excessive Agency	Least privilege, human-in-the-loop, minimize extensions
System Prompt Leakage	No secrets in prompts, external guardrails
Vector/Embedding	Access controls, data validation, monitoring
Misinformation	RAG, fine-tuning, human oversight, cross-verification
Unbounded Consumption	Rate limiting, input validation, resource monitoring
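
The "data sanitization" entry for Sensitive Disclosure can be sketched as a redaction pass applied before text is sent to a model or written to logs. The patterns below are illustrative assumptions only: they catch simple email addresses and strings shaped like `sk-...` style keys, and they are not a substitute for a dedicated PII or secret scanner.

```python
import re

# Hypothetical, deliberately simple patterns for illustration.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9_-]{16,}\b")


def redact(text: str) -> str:
    """Replace email addresses and key-like strings with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return SECRET.sub("[SECRET]", text)
```

A call such as `redact("mail bob@example.com")` yields `"mail [EMAIL]"`; anything the patterns do not recognize passes through unchanged, which is why this belongs alongside access controls rather than in place of them.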

Key Principles

Never trust LLM output - Validate and sanitize all outputs before use
Least privilege - Grant minimum necessary permissions to LLM systems
Defense in depth - Layer multiple security controls
Human oversight - Require approval for high-impact actions
Monitor and log - Track all LLM interactions for anomaly detection
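
The least-privilege and human-oversight principles can be combined in a tool dispatcher: the model may only call allowlisted tools, and high-impact tools additionally require an approval callback. This is a minimal sketch under stated assumptions; the tool functions, the `TOOLS` registry, and the `dispatch` signature are all hypothetical.

```python
from typing import Callable


# Hypothetical tools an agent might be given.
def search_docs(query: str) -> str:
    return f"results for {query}"


def delete_record(record_id: str) -> str:
    return f"deleted {record_id}"


# Allowlist: tool name -> (function, requires human approval).
TOOLS: dict[str, tuple[Callable[[str], str], bool]] = {
    "search_docs": (search_docs, False),
    "delete_record": (delete_record, True),
}


def dispatch(name: str, arg: str, approve: Callable[[str, str], bool]) -> str:
    """Run a model-requested tool call under an allowlist and approval gate."""
    if name not in TOOLS:
        # Anything outside the allowlist is refused outright.
        raise PermissionError(f"tool not allowed: {name}")
    func, needs_approval = TOOLS[name]
    if needs_approval and not approve(name, arg):
        return "denied: human approval required"
    return func(arg)
```

In practice `approve` would prompt a human reviewer; passing it as a parameter keeps the gate explicit and makes the refusal path easy to test.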

Maintainer

semgrep Core maintainer

Source details

Full Name: semgrep/skills
Branch: main
Path in repo: skills/llm-security
License: Other
Topics: claude-code skills agents security
