Agent skill

audit-augmentation

Augments Trailmark code graphs with external audit findings from SARIF static analysis results and weAudit annotation files. Maps findings to graph nodes by file and line overlap, creates severity-based subgraphs, and enables cross-referencing findings with pre-analysis data (blast radius, taint, etc.). Use when projecting SARIF results onto a code graph, overlaying weAudit annotations, cross-referencing Semgrep or CodeQL findings with call graph data, or visualizing audit findings in the context of code structure.

Stars 4,181
Forks 369

Install this agent skill to your Project

npx add-skill https://github.com/trailofbits/skills/tree/main/plugins/trailmark/skills/audit-augmentation

SKILL.md

Audit Augmentation

Projects findings from external tools (SARIF) and human auditors (weAudit) onto Trailmark code graphs as annotations and subgraphs.

When to Use

  • Importing Semgrep, CodeQL, or other SARIF-producing tool results into a graph
  • Importing weAudit audit annotations into a graph
  • Cross-referencing static analysis findings with blast radius or taint data
  • Querying which functions have high-severity findings
  • Visualizing audit coverage alongside code structure

When NOT to Use

  • Running static analysis tools (use semgrep/codeql directly, then import)
  • Building the code graph itself (use the trailmark skill)
  • Generating diagrams (use the diagramming-code skill after augmenting)

Rationalizations to Reject

Rationalization Why It's Wrong Required Action
"The user only asked about SARIF, skip pre-analysis" Without pre-analysis, you can't cross-reference findings with blast radius or taint Always run engine.preanalysis() before augmenting
"Unmatched findings don't matter" Unmatched findings may indicate parsing gaps or out-of-scope files Report unmatched count and investigate if high
"One severity subgraph is enough" Different severities need different triage workflows Query all severity subgraphs, not just error
"SARIF results speak for themselves" Findings without graph context lack blast radius and taint reachability Cross-reference with pre-analysis subgraphs
"weAudit and SARIF overlap, pick one" Human auditors and tools find different things Import both when available
"Tool isn't installed, I'll do it manually" Manual analysis misses what tooling catches Install trailmark first

Installation

MANDATORY: If uv run trailmark fails, install trailmark first:

bash
uv pip install trailmark

Quick Start

CLI

bash
# Augment with SARIF
uv run trailmark augment {targetDir} --sarif results.sarif

# Augment with weAudit
uv run trailmark augment {targetDir} --weaudit .vscode/alice.weaudit

# Both at once, output JSON
uv run trailmark augment {targetDir} \
    --sarif results.sarif \
    --weaudit .vscode/alice.weaudit \
    --json

Programmatic API

python
from trailmark.query.api import QueryEngine

engine = QueryEngine.from_directory("{targetDir}", language="python")

# Run pre-analysis first for cross-referencing
engine.preanalysis()

# Augment with SARIF
result = engine.augment_sarif("results.sarif")
# result: {matched_findings: 12, unmatched_findings: 3, subgraphs_created: [...]}

# Augment with weAudit
result = engine.augment_weaudit(".vscode/alice.weaudit")

# Query findings
engine.findings()                                       # All findings
engine.subgraph("sarif:error")                          # High-severity SARIF
engine.subgraph("weaudit:high")                         # High-severity weAudit
engine.subgraph("sarif:semgrep")                        # By tool name
engine.annotations_of("function_name")                  # Per-node lookup

Workflow

Augmentation Progress:
- [ ] Step 1: Build graph and run pre-analysis
- [ ] Step 2: Locate SARIF/weAudit files
- [ ] Step 3: Run augmentation
- [ ] Step 4: Inspect results and subgraphs
- [ ] Step 5: Cross-reference with pre-analysis

Step 1: Build the graph and run pre-analysis for blast radius and taint context:

python
engine = QueryEngine.from_directory("{targetDir}", language="{lang}")
engine.preanalysis()

Step 2: Locate input files:

  • SARIF: Usually output by tools like semgrep --sarif -o results.sarif or codeql database analyze --format=sarif-latest
  • weAudit: Stored in .vscode/<username>.weaudit within the workspace

Step 3: Run augmentation via engine.augment_sarif() or engine.augment_weaudit(). Check unmatched_findings in the result — these are findings whose file/line locations didn't overlap any parsed code unit.

Step 4: Query findings and subgraphs. Use engine.findings() to list all annotated nodes. Use engine.subgraph_names() to see available subgraphs.

Step 5: Cross-reference with pre-analysis data to prioritize:

  • Findings on tainted nodes: overlap sarif:error with tainted subgraph
  • Findings on high blast radius nodes: overlap with high_blast_radius
  • Findings on privilege boundaries: overlap with privilege_boundary

Annotation Format

Findings are stored as standard Trailmark annotations:

  • Kind: finding (tool-generated) or audit_note (human notes)
  • Source: sarif:<tool_name> or weaudit:<author>
  • Description: Compact single-line: [SEVERITY] rule-id: message (tool)

Subgraphs Created

Subgraph Contents
sarif:error Nodes with SARIF error-level findings
sarif:warning Nodes with SARIF warning-level findings
sarif:note Nodes with SARIF note-level findings
sarif:<tool> Nodes flagged by a specific tool
weaudit:high Nodes with high-severity weAudit findings
weaudit:medium Nodes with medium-severity weAudit findings
weaudit:low Nodes with low-severity weAudit findings
weaudit:findings All weAudit findings (entryType=0)
weaudit:notes All weAudit notes (entryType=1)

How Matching Works

Findings are matched to graph nodes by file path and line range overlap:

  1. Finding file path is normalized relative to the graph's root_path
  2. Nodes whose location.file_path matches AND whose line range overlaps are selected
  3. The tightest match (smallest span) is preferred
  4. If a finding's location doesn't overlap any node, it counts as unmatched

SARIF paths may be relative, absolute, or file:// URIs — all are handled. weAudit uses 0-indexed lines which are converted to 1-indexed automatically.

Supporting Documentation

  • references/formats.md — SARIF 2.1.0 and weAudit file format field reference

Expand your agent's capabilities with these related and highly-rated skills.

trailofbits/skills

gh-cli

Enforces authenticated gh CLI workflows over unauthenticated curl/WebFetch patterns. Use when working with GitHub URLs, API access, pull requests, or issues.

4,181 369
Explore
trailofbits/skills

supply-chain-risk-auditor

Identifies dependencies at heightened risk of exploitation or takeover. Use when assessing supply chain attack surface, evaluating dependency health, or scoping security engagements.

4,181 369
Explore
trailofbits/skills

zeroize-audit

Detects missing zeroization of sensitive data in source code and identifies zeroization removed by compiler optimizations, with assembly-level analysis, and control-flow verification. Use for auditing C/C++/Rust code handling secrets, keys, passwords, or other sensitive data.

4,181 369
Explore
trailofbits/skills

sharp-edges

Identifies error-prone APIs, dangerous configurations, and footgun designs that enable security mistakes. Use when reviewing API designs, configuration schemas, cryptographic library ergonomics, or evaluating whether code follows 'secure by default' and 'pit of success' principles. Triggers: footgun, misuse-resistant, secure defaults, API usability, dangerous configuration.

4,181 369
Explore
trailofbits/skills

insecure-defaults

Detects fail-open insecure defaults (hardcoded secrets, weak auth, permissive security) that allow apps to run insecurely in production. Use when auditing security, reviewing config management, or analyzing environment variable handling.

4,181 369
Explore
trailofbits/skills

dwarf-expert

Provides expertise for analyzing DWARF debug files and understanding the DWARF debug format/standard (v3-v5). Triggers when understanding DWARF information, interacting with DWARF files, answering DWARF-related questions, or working with code that parses DWARF data.

4,181 369
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results