review-artifacts
Review artifact scan results for reportable bug bounty findings. Analyzes archives, SQL dumps, binary databases, and source backups for secrets, code vulnerabilities, misconfigurations, and PII exposure. Focuses on high-confidence findings with clear security impact.
Install this agent skill in your project:
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/review-artifacts
Review Artifact Findings
Triage artifact scan results to find reportable bug bounty vulnerabilities. Artifacts are files that automated scanners (Trufflehog, Semgrep) cannot process directly; they require extraction or manual analysis.
Goal: Find high-confidence findings with verified exploitability and clear security impact.
Project Structure
All paths are relative to the project root (working directory):
threat_hunting/ # Project root (working directory)
├── <org-name>/ # Cloned repositories (e.g., jitsi/, tronprotocol/)
│ └── <repo-name>/ # Individual repository source code
├── findings/<org-name>/ # All scan results for an organization
│ ├── semgrep-results/
│ ├── trufflehog-results/
│ ├── artifact-results/ # Artifact scan JSON output
│ │ └── <repo-name>.json
│ ├── kics-results/
│ └── reports/ # Final consolidated reports
└── scripts/ # ALL extraction and scanning scripts
Repository source code location: <org-name>/<repo-name>/ (e.g., jitsi/jicofo/src/main/java/...)
Scan results location: findings/<org-name>/artifact-results/<repo-name>.json
CRITICAL: Do NOT Write Custom Scripts
All extraction scripts already exist in ./scripts/. Never write custom jq, Python, or shell scripts to parse findings. The existing scripts handle:
- Complex JSON/NDJSON parsing
- Large file handling
- Edge cases and error handling
- Consistent output formatting
Available extraction scripts:
- ./scripts/extract-semgrep-findings.sh - Parse Semgrep results
- ./scripts/extract-trufflehog-findings.sh - Parse Trufflehog results
- ./scripts/extract-artifact-findings.sh - Parse artifact results
- ./scripts/extract-kics-findings.sh - Parse KICS results
- ./scripts/extract-and-scan-archives.sh - Extract archives and scan for secrets
- ./scripts/safe-extract-archive.sh - Safely extract individual archives
If you need functionality not provided by existing scripts, ask the user to update the scripts rather than writing one-off solutions.
Quick Start
# Extract from findings/ directory (per-repo files)
./scripts/extract-artifact-findings.sh <org-name> # All repos, summary
./scripts/extract-artifact-findings.sh <org-name> archives # Just archives
./scripts/extract-artifact-findings.sh <org-name> sql # Just SQL dumps
# Extract from catalog scans (merged gzipped files)
./scripts/extract-artifact-findings.sh <org-name> --catalog # Latest scan
./scripts/extract-artifact-findings.sh <org-name> --scan 2025-12-24 # Specific scan
# Extract archives and scan for secrets
./scripts/extract-and-scan-archives.sh <org-name>
Data Sources:
- findings/<org>/artifact-results/*.json - Per-repo results (uncompressed)
- catalog/tracked/<org>/scans/<timestamp>/artifacts.json.gz - Merged scan (gzipped)
Workflow
Step 1: Discover Artifacts and Verify Counts
Run the extraction script with count format first:
# Step 1a: Get counts to verify totals
./scripts/extract-artifact-findings.sh <org-name> count
# Step 1b: Get full summary
./scripts/extract-artifact-findings.sh <org-name>
CRITICAL COUNT VERIFICATION: The summary output shows totals at the bottom. These MUST match the sums from step 1a. If they don't match, the extraction may be truncating results - investigate before proceeding.
This categorizes artifacts by type:
- Archives - Need extraction before scanning
- SQL dumps - May contain PII (marked [CONTAINS DATA] if they have actual records)
- Binary databases - SQLite and similar formats requiring manual inspection
- Source backups - .bak, .old files that may reveal past vulnerabilities
Step 2: Extract and Scan Archives
CRITICAL: Always use safe extraction - NEVER extract manually!
# Scan all archives for secrets
./scripts/extract-and-scan-archives.sh <org-name>
# Or extract a single archive for manual review
./scripts/safe-extract-archive.sh <archive-path> [output-dir]
Safe extraction protects against:
- Path traversal (zip-slip) attacks
- Symlink/hardlink attacks
- Decompression bombs (size limits enforced)
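The zip-slip guard such a wrapper performs can be sketched in a few lines (a hypothetical simplification, not the actual ./scripts/safe-extract-archive.sh, which additionally enforces size limits and symlink checks):

```shell
# Hypothetical sketch of the path-traversal guard in a safe extractor.
# The real script also enforces size limits and rejects symlinks/hardlinks.
safe_extract() {
    archive="$1"
    outdir="${2:-extracted}"
    # Refuse entries with absolute paths or ".." components (zip-slip).
    if tar -tzf "$archive" | grep -qE '(^/|(^|/)\.\.(/|$))'; then
        echo "refusing: archive contains absolute or traversal paths" >&2
        return 1
    fi
    mkdir -p "$outdir" && tar -xzf "$archive" -C "$outdir"
}
```

The key design point: entry names are validated before anything touches the filesystem, so a malicious archive never gets a chance to write outside the output directory.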
Step 3: Analyze for Vulnerabilities
After secret scanning, review extracted content for:
- Code vulnerabilities - Injection, auth bypass, dangerous functions
- Misconfigurations - Privileged containers, overly permissive access
- PII exposure - Real user data in dumps
- Architectural intel - Internal endpoints, attack surface details
Step 4: Assess Reportability
For each finding, evaluate:
- Is it exploitable? - Can you demonstrate the impact?
- Is it in production code? - Not test fixtures or examples
- Is it in scope? - Check the bug bounty program policy
- Is it high confidence? - Clear security impact, not theoretical
Step 5: Document Findings
Use the templates below for reportable findings.
Analysis by Artifact Type
Archives
Archives may contain secrets, vulnerable code, or sensitive configurations.
High-Risk Indicators
- Names: backup, prod, production, deploy, config
- Location: deploy/, infrastructure/, scripts/
- Size: Large archives may contain full codebases or database dumps
Low-Risk (Often Skip)
- Test/sample data: test/, fixtures/, samples/
- Asset bundles: images, fonts, icons
- Vendored dependencies (report upstream instead)
What to Look For
Secrets (via Trufflehog - automatic):
- API keys, tokens, credentials
- Private keys (SSH, TLS)
- Database connection strings
Kubernetes/Helm Misconfigurations:
# Privileged containers (container escape risk)
grep -r "privileged: true" <dir>
grep -rE "host(Network|PID|IPC): true" <dir>
# Running as root
grep -r "runAsUser: 0" <dir>
grep -r "runAsNonRoot: false" <dir>
# Overly permissive RBAC
grep -rE "cluster-admin" <dir>
grep -rE "verbs:.*\*" <dir>
# Exposed services
grep -rE "type: (LoadBalancer|NodePort)" <dir>
Infrastructure Misconfigurations:
# Public cloud resources
grep -rE "acl.*public" <dir>
grep -rE "0\.0\.0\.0/0" <dir>
# Disabled encryption
grep -rE "encrypted.*false" <dir>
Code Vulnerabilities:
# Command injection
grep -rnE "(exec|system|popen|subprocess|shell_exec|eval)\s*\(" <dir>
# Deserialization
grep -rnE "(pickle\.load|yaml\.load|unserialize|readObject)" <dir>
# SQL injection patterns
grep -rn "SELECT.*\+\|INSERT.*\+\|UPDATE.*\+" <dir>
# Debug/admin endpoints
grep -rnE "(debug.*true|DEBUG.*=.*1|/debug/|/admin/)" <dir>
Architectural Intelligence:
# Internal hostnames
grep -rE "\.internal\.|\.local\.|\.corp\." <dir>
# Database connection strings (even without creds - reveals topology)
grep -rE "(mysql|postgres|mongodb|redis)://" <dir>
# API endpoints
grep -rE "/api/v[0-9]|/internal/" <dir>
SQL Dumps
SQL dumps with [CONTAINS DATA] have INSERT/COPY statements - real data.
Reportable Findings
PII Exposure (CRITICAL if real user data):
# Sensitive table names
grep -iE 'CREATE TABLE.*(user|customer|account|payment|order)' <file.sql>
# PII columns
grep -iE '(email|password|ssn|phone|address|credit_card)' <file.sql>
# Sample the data
grep -A5 'INSERT INTO' <file.sql> | head -50
What makes it reportable:
- Real user data (not obviously fake like test@example.com)
- Password hashes (even hashed passwords are sensitive)
- Payment/financial information
- Health/medical data
Likely false positive:
- Schema-only dumps (no INSERT statements)
- Clearly fake data (John Doe, 123 Main St, sequential IDs)
- Files in test/, fixtures/, samples/
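The schema-only heuristic can be checked mechanically by counting data-bearing statements (a sketch; the helper name is illustrative):

```shell
# Sketch: count data-bearing statements in a SQL dump.
# A result of 0 means schema-only, which is usually not reportable.
count_data_statements() {
    grep -ciE '^(INSERT INTO|COPY .* FROM stdin)' "$1" || true
}
```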
Binary Databases
SQLite and other binary databases need manual inspection.
# List tables
sqlite3 <file.db> ".tables"
# Show schema
sqlite3 <file.db> ".schema"
# Look for sensitive tables
sqlite3 <file.db> ".tables" | grep -iE '(user|account|session|token|key|secret|cred)'
# Sample data
sqlite3 <file.db> "SELECT * FROM <table> LIMIT 5;"
What to look for:
- Session tokens or API keys
- User credentials
- Cached sensitive data
- Application secrets
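The per-table commands above can be rolled into one sweep over every table (a sketch assuming the sqlite3 CLI is installed; the function name is illustrative):

```shell
# Sketch: print a few sample rows from every table in a SQLite database.
sample_sqlite_tables() {
    db="$1"
    # ".tables" prints names in space-separated columns; split to one per line.
    sqlite3 "$db" ".tables" | tr -s ' ' '\n' | while read -r table; do
        [ -n "$table" ] || continue
        echo "== $table =="
        sqlite3 "$db" "SELECT * FROM \"$table\" LIMIT 3;"
    done
}
```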
Source Code Backups
Files like .php.bak, .py.old, .env.bak may reveal:
- Removed secrets - Credentials deleted from current version
- Fixed vulnerabilities - Bugs that were patched (check if fix is complete)
- Debug code - Logging passwords, disabled auth checks
# Compare backup to current file
diff file.php file.php.bak
# Search for secrets
grep -n 'password\|secret\|key\|token\|api_key' file.bak
# Search for vulnerabilities
grep -n 'eval\|exec\|system\|shell_exec' file.bak
High-risk backups:
- .env.bak, .env.production.old - Environment files
- config.php.old, settings.py.bak - Configuration files
- auth.php.bak, login.py.old - Authentication code
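Before diffing, backup candidates can be enumerated across a checkout (a sketch; extend the name patterns to match what you actually see):

```shell
# Sketch: locate likely source/config backups, skipping .git and vendored code.
find_backup_files() {
    find "$1" -type f \
        \( -name '*.bak' -o -name '*.old' -o -name '*.orig' -o -name '.env.*' \) \
        -not -path '*/.git/*' -not -path '*/node_modules/*'
}
```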
Reportability Assessment
Directly Reportable
| Finding Type | Severity | Requirements |
|---|---|---|
| Verified active secret | CRITICAL | Confirmed working (API responds, login succeeds) |
| Real PII in SQL dump | CRITICAL | Actual user data, not test fixtures |
| Privileged container + hostNetwork | HIGH | In production Helm chart, not examples |
| Command injection in code | HIGH | Reachable code path, not dead code |
Requires Further Investigation
| Finding Type | Next Steps |
|---|---|
| Internal hostnames discovered | Check if they resolve externally, test for SSRF |
| Unverified secrets | Attempt to use them, check if rotated |
| Misconfiguration in Helm chart | Verify it's deployed, not just in repo |
| Architectural intel | Use to inform testing of main application |
Likely Not Reportable
- Secrets in test fixtures or example code
- Schema-only SQL dumps
- Misconfigurations in vendored dependencies
- Findings in archived/deprecated code that's no longer deployed
Documentation Templates
Secret in Archive
## SECRET EXPOSURE - [Type]
**Repository**: org/repo-name
**Archive**: path/to/archive.tgz
**File**: extracted/path/to/file
**Secret Type**: AWS Access Key / API Token / Database Password / etc.
**Verified**: Yes/No (describe verification)
**Impact**:
- What access does this secret provide?
- What data/systems are at risk?
**Reproduction**:
1. Extract archive: `./scripts/safe-extract-archive.sh <path>`
2. Secret location: `<file>:<line>`
3. Verification: `<command or steps>`
**Recommendation**: Rotate immediately, remove from repository history
PII Exposure
## PII EXPOSURE - SQL Dump
**Repository**: org/repo-name
**File**: path/to/dump.sql
**Size**: X MB
**Data Exposed**:
- Table: users (X records)
- Columns: email, password_hash, phone, address
- Table: payments (X records)
- Columns: credit_card_last4, billing_address
**Real vs Test Data**: [Evidence this is real data]
**Impact**: X user records exposed including [specific PII types]
**Recommendation**:
1. Remove from repository and git history
2. Assess if breach notification required
3. Force password reset if credentials exposed
Code Vulnerability in Archive
## CODE VULNERABILITY - [Type]
**Repository**: org/repo-name
**Archive**: path/to/archive.tgz
**Location**: extracted/file.py:45
**Vulnerability**: [Type - e.g., Command Injection, Privileged Container]
**Details**:
[Explain the vulnerability]
**Exploitability**:
- Is this code deployed/reachable?
- What's required to exploit?
**Impact**: [What can an attacker do?]
**Recommendation**: [Specific fix]
Architectural Intelligence
## ARCHITECTURAL DISCOVERY
**Repository**: org/repo-name
**Source**: path/to/config
**Discovered**:
- Internal endpoints: [list]
- Service topology: [description]
- Authentication mechanism: [details]
**Security Relevance**:
[How this informs further testing]
**Follow-up Actions**:
- [ ] Test endpoint X for vulnerability Y
- [ ] Check if internal hostname resolves externally
False Positive Indicators
Skip these:
- Files in test/, fixtures/, testdata/, samples/, examples/
- Files with example, sample, demo, dummy, mock in name
- Vendored/third-party code (report upstream)
- Schema-only SQL (no INSERT/COPY statements)
- Obviously fake data (test@example.com, password123)
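These skip heuristics can be applied mechanically to a list of finding paths (a sketch; the function name and pattern are illustrative, so tune them for your repositories):

```shell
# Sketch: drop finding paths that suggest test, sample, or vendored content.
filter_likely_real() {
    grep -vE '(^|/)(test|tests|testdata|fixtures|samples|examples|vendor|node_modules)(/|$)'
}
```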
Investigate these:
- Files in config/, deploy/, scripts/, backup/, infrastructure/
- Files with prod, production, live in name
- SQL dumps marked [CONTAINS DATA]
- Environment file backups (.env.*)
- Large files (>1MB may contain real data)
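The large-file heuristic is easy to apply with find (a sketch; same >1MB threshold as above):

```shell
# Sketch: surface files over 1 MiB -- large dumps often hold real data.
find_large_artifacts() {
    find "$1" -type f -size +1M
}
```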
Reference
Safe Extraction Commands
# Extract and scan all archives (preferred)
./scripts/extract-and-scan-archives.sh <org-name>
# Extract single archive
./scripts/safe-extract-archive.sh <archive-path>
# Extract to specific directory
./scripts/safe-extract-archive.sh <archive-path> <output-dir>
# Adjust limits for large archives (values in bytes: 200 MiB archive, 1 GiB extracted)
SAFE_EXTRACT_MAX_ARCHIVE_SIZE=209715200 \
SAFE_EXTRACT_MAX_EXTRACTED_SIZE=1073741824 \
./scripts/safe-extract-archive.sh <archive-path>
Extraction Script Options
# Summary view (default)
./scripts/extract-artifact-findings.sh <org>
# Filter by type
./scripts/extract-artifact-findings.sh <org> archives
./scripts/extract-artifact-findings.sh <org> sql
./scripts/extract-artifact-findings.sh <org> sources
# Full JSON
./scripts/extract-artifact-findings.sh <org> full