Agent skill
arkime-data-manager
Manage Arkime data (wipe database, import PCAPs)
Install this agent skill to your Project
npx add-skill https://github.com/goodluckz/claude-code-config/tree/main/skills/arkime-data-manager
SKILL.md
Arkime Data Manager
This skill provides commands and workflows for managing Arkime data: wiping the database and importing PCAP files.
Environments
| Property | ENV-41 (Local) | ENV-42 (Remote) |
|---|---|---|
| Host | localhost | dns-analy4 |
| Container | env-41_arkime |
env-42_arkime |
| Arkime Port | 8041 | 8042 |
| OpenSearch | 4-node (env-41_os01:9200) |
3-node (env-42_os01:9200) |
| Project Dir | /Users/zhaoliang/LocalRepos/envs/ENV-41/ |
/export0/Work/zhaoliang/ENV-42/ (remote) |
| Local Repo | same as above | /Users/zhaoliang/LocalRepos/envs/ENV-42/ |
| PCAP Dir (host) | ./env-41_pcap_dir/ |
/export0/Work/zhaoliang/env-42_pcap_dir/ |
| PCAP Dir (container) | /data/pcap |
/data/pcap |
| Config/Logs mount | ./env-41_arkime_config → /data/config, ./env-41_arkime_logs → /data/logs |
same dirs |
| Credentials | admin/admin (Digest Auth) | admin/admin (Digest Auth) |
Prerequisites
- Docker: ENV-41 runs locally, ENV-42 runs on dns-analy4
- SSH:
ssh dns-analy4for remote ENV-42 operations - Arkime Container:
env-41_arkime(local) orenv-42_arkime(remote)
Database Operations
Wipe Database (Clear All Data)
⚠️ WARNING: This operation permanently deletes all session data from OpenSearch. PCAP files on disk are NOT deleted.
Prerequisites for Wipe
- Stop all captures and viewers (recommended but not required for dev environments)
- Backup data first (optional but recommended):
bash
docker exec env-41_arkime /opt/arkime/db/db.pl http://env-41_os01:9200 backup
Using db.pl wipe
Location: /opt/arkime/db/db.pl (inside Arkime container)
Syntax:
echo "WIPE" | docker exec -i env-41_arkime /opt/arkime/db/db.pl http://env-41_os01:9200 wipe
Important Notes:
- Must use
echo "WIPE" |to provide confirmation input - Must use
-iflag withdocker execto enable stdin - Use
env-41_os01:9200(container network name) NOTlocalhost:9200when running from inside container - Operation erases all indices and recreates them (returns to initial state)
- Does NOT delete PCAP files from disk
- Does NOT delete user accounts/settings
Example Output:
It is STRONGLY recommended that you stop ALL Arkime captures and viewers before proceeding.
Use 'db.pl http://env-41_os01:9200 backup' to backup db first.
There are 4 OpenSearch/Elasticsearch data nodes, if you expect more please fix first before proceeding.
This will delete ALL session data in OpenSearch/Elasticsearch! (It does not delete the pcap files on disk or user info.)
Type "WIPE" to continue - do you want to wipe everything??
Erasing
Creating
Finished
Verify Data Cleared
Check session count after wipe:
curl -s --digest -u admin:admin 'http://localhost:8041/api/sessions?length=0' | jq '{recordsTotal, recordsFiltered}'
Expected output:
{
"recordsTotal": 0,
"recordsFiltered": 0
}
Check OpenSearch Cluster Health
Before running wipe, verify cluster is healthy:
curl -s -u admin:admin http://localhost:9200/_cluster/health | jq '.status'
Expected: "green" or "yellow" (avoid wiping if "red")
PCAP Import Operations
Import PCAPs with Automatic Tagging
Script: /Users/zhaoliang/LocalRepos/envs/ENV-41/scripts/import-env41-samples.sh
This script imports PCAPs from organized subdirectories and automatically applies tags based on directory names.
Directory Naming Convention
Format: {dataset}-{ip-version}-{protocol}-{region}
Examples:
aws-v4-scan-tokyo/→ Tags:dataset:aws,ip-version:v4,protocol:scan,region:tokyoaws-v6-ntp-california/→ Tags:dataset:aws,ip-version:v6,protocol:ntp,region:californiahoneynet-v6-ntp-tokyo/→ Tags:dataset:honeynet,ip-version:v6,protocol:ntp,region:tokyo
Tag Components:
- dataset:
aws,honeynet - ip-version:
v4,v6(also acceptsipv4,ipv6) - protocol:
ntp,scan - region:
tokyo,sydney,virginia,california,mumbai,saopaulo,frankfurt
Basic Usage
# Import all PCAPs from all subdirectories
cd /Users/zhaoliang/LocalRepos/envs/ENV-41
bash scripts/import-env41-samples.sh
Filtered Import
# Import only AWS NTP data
bash scripts/import-env41-samples.sh -d aws -p ntp
# Import only Tokyo region data
bash scripts/import-env41-samples.sh -r tokyo
# Import only IPv6 data
bash scripts/import-env41-samples.sh -i v6
# Import specific combination: AWS IPv6 NTP from California
bash scripts/import-env41-samples.sh -d aws -i v6 -p ntp -r california
Dry Run (Preview Without Importing)
bash scripts/import-env41-samples.sh -n
This shows what would be imported without actually processing files.
Script Options
-d, --dataset Dataset filter: aws, honeynet, or all (default: all)
-r, --region Region filter: tokyo, sydney, virginia, california, mumbai, saopaulo, frankfurt, or all (default: all)
-p, --protocol Protocol filter: ntp, scan, or all (default: all)
-i, --ip-version IP version filter: v4, v6, or all (default: all)
-n, --dry-run Show what would be imported without actually importing
-h, --help Show help message
How Import Works
- Scans PCAP directory for subdirectories matching naming pattern
- Parses directory names to extract tag metadata
- Filters directories based on command-line options
- Counts PCAP files in each directory
- Imports using batch mode:
docker exec env-41_arkime /opt/arkime/bin/capture --pcapdir /data/pcap/subdir --recursive -t "tags" - Reports success/failure for each directory
Verify Import Success
Check total sessions:
curl -s --digest -u admin:admin 'http://localhost:8041/api/sessions?length=0' | jq '.recordsTotal'
Check sessions by tag:
# Check specific dataset
curl -s --digest -u admin:admin \
'http://localhost:8041/api/sessions?expression=tags%20%3D%3D%20dataset%3Aaws&length=0' \
| jq '.recordsTotal'
# Check specific experiment
curl -s --digest -u admin:admin \
'http://localhost:8041/api/sessions?expression=tags%20%3D%3D%20region%3Atokyo&length=0' \
| jq '.recordsTotal'
Complete Workflow: Wipe and Reimport
Use Case: Reset ENV-41 to clean state with fresh data import
Step 1: Start ENV-41 Cluster
cd /Users/zhaoliang/LocalRepos/envs/ENV-41
docker-compose up -d
Wait for OpenSearch cluster to become healthy (~30-60 seconds):
# Check cluster health (wait for "green")
curl -s -u admin:admin http://localhost:9200/_cluster/health | jq '.status'
Step 2: Wipe Existing Data
echo "WIPE" | docker exec -i env-41_arkime /opt/arkime/db/db.pl http://env-41_os01:9200 wipe
Expected output ends with:
Erasing
Creating
Finished
Step 3: Verify Data Cleared
curl -s --digest -u admin:admin 'http://localhost:8041/api/sessions?length=0' | jq '.recordsTotal'
Expected: 0
Step 4: Import Fresh Data
bash scripts/import-env41-samples.sh
Or with filters (e.g., only AWS data):
bash scripts/import-env41-samples.sh -d aws
Step 5: Verify Import
# Check total sessions imported
curl -s --digest -u admin:admin 'http://localhost:8041/api/sessions?length=0' | jq '.recordsTotal'
# Should show a number > 0 (e.g., 4278218 for full ENV-41 dataset)
Troubleshooting
Wipe Command Fails with "Couldn't GET http://localhost:9200"
Error: Couldn't GET http://localhost:9200/_cluster/health the http status code is 500
Cause: OpenSearch cluster not ready or using wrong endpoint
Solutions:
-
Wait for cluster to become healthy:
bashcurl -s -u admin:admin http://localhost:9200/_cluster/health | jq '.status'Wait until status is
"green"(not"red") -
Use correct endpoint from container:
- ✅ Correct:
http://env-41_os01:9200(from inside arkime container) - ❌ Wrong:
http://localhost:9200(not reachable from inside container)
- ✅ Correct:
Import Script Hangs or Shows Empty Tags
Symptom: Tags show as dataset:,ip-version:,protocol:,region: (empty values)
Cause: Directory name doesn't match expected pattern
Expected Pattern: {dataset}-{ip-version}-{protocol}-{region}
- Example:
aws-v6-ntp-tokyo,honeynet-v4-scan-sydney
Solutions:
- Rename directories to match pattern
- Skip non-matching directories (script will import them with empty tags but continue)
- Directories like
aws/,honeynet/,aws-IPv4/don't match pattern but will still be imported
Docker Container Not Running
Error: Error response from daemon: Container env-41_arkime is not running
Solution:
cd /Users/zhaoliang/LocalRepos/envs/ENV-41
docker-compose up -d
Permission Denied Accessing PCAP Files
Cause: PCAP directory not properly mounted in container
Solution: Check docker-compose.yml volume mount:
volumes:
- ./env-41_pcap_dir:/data/pcap
Ensure PCAP files are in /Users/zhaoliang/LocalRepos/envs/ENV-41/env-41_pcap_dir/
Environment Configuration
ENV-41 Directory: /Users/zhaoliang/LocalRepos/envs/ENV-41/
Key Files:
docker-compose.yml: Service configuration.env: Environment variables (ports, node names, paths)scripts/import-env41-samples.sh: PCAP import scriptenv-41_pcap_dir/: PCAP files organized by subdirectory
Services:
- Arkime:
env-41_arkime(port 8041) - OpenSearch Nodes:
env-41_os01,env-41_os02,env-41_os03,env-41_os04- OS01 exposed on localhost:9200 (REST API), localhost:9600 (Performance Analyzer)
Credentials:
- Arkime Web UI: admin/admin (HTTP Digest Auth)
- OpenSearch: admin/admin (when security enabled)
ENV-42 Remote PCAP Import (Background)
ENV-42 runs on dns-analy4. All commands are executed via SSH. The import runs in the background inside the container using docker exec -d.
PCAP Subdirectories
| Directory | Content | Files |
|---|---|---|
darknet-nii-202511 |
Darknet traffic | ~30 |
honeynet-v6-ntp-tokyo |
Honeynet NTP | ~540 |
honeynet-v6-scan-tokyo |
Honeynet scan | ~1004 |
Import Scripts
1. Batch import with progress tracking (scripts/import-with-progress.sh):
- Processes each subdirectory sequentially with timing
- Uses
-qto suppress verbose capture output - Uses
--skipto avoid re-importing already imported files - Logs to
/data/logs/import_progress.log
2. Tagged import (scripts/import-env42-pcaps.sh):
- Auto-tags based on directory naming convention
{dataset}-{ipversion}-{protocol}-{region} - Supports filters:
-d dataset,-r region,-p protocol,-i ip-version - Supports dry run:
-n
Start Background Import
# Copy script into container (scripts dir is NOT volume-mounted)
ssh dns-analy4 "docker cp /export0/Work/zhaoliang/ENV-42/scripts/import-with-progress.sh env-42_arkime:/tmp/import-with-progress.sh"
# Start in detached mode (survives SSH disconnect)
ssh dns-analy4 "docker exec -d env-42_arkime bash /tmp/import-with-progress.sh"
Monitor Progress
# View progress log
ssh dns-analy4 "docker exec env-42_arkime tail -10 /data/logs/import_progress.log"
# Check total imported file count
ssh dns-analy4 'curl -s --digest -u admin:admin "http://localhost:8042/api/files?length=1" | python3 -c "import json,sys; print(json.load(sys.stdin)[\"recordsTotal\"])"'
# Check if capture process is still running
ssh dns-analy4 "docker exec env-42_arkime pgrep -f 'capture.*pcapdir'"
# Check total session count
ssh dns-analy4 'curl -s --digest -u admin:admin "http://localhost:8042/api/sessions?length=0" | python3 -c "import json,sys; print(json.load(sys.stdin)[\"recordsTotal\"])"'
Wipe ENV-42 Database
ssh dns-analy4 'echo "WIPE" | docker exec -i env-42_arkime /opt/arkime/db/db.pl http://env-42_os01:9200 wipe'
Volume Mounts (ENV-42)
The scripts/ directory is NOT mounted into the container. Only these paths are mounted:
./env-41_arkime_config→/data/config./env-41_arkime_logs→/data/logs(root-owned)$PCAP_DIR(/export0/Work/zhaoliang/env-42_pcap_dir) →/data/pcap./package→/data/package
To get scripts into the container, use docker cp.
Sync Remote Changes to Local
rsync -avz dns-analy4:/export0/Work/zhaoliang/ENV-42/scripts/ \
/Users/zhaoliang/LocalRepos/envs/ENV-42/scripts/
References
- Arkime Documentation: https://arkime.com/
- Arkime db.pl:
/opt/arkime/db/db.pl --help - ENV-41 Configuration:
/Users/zhaoliang/LocalRepos/envs/ENV-41/ - ENV-42 Configuration:
/Users/zhaoliang/LocalRepos/envs/ENV-42/(local) //export0/Work/zhaoliang/ENV-42/(remote) - Import Script (ENV-41):
/Users/zhaoliang/LocalRepos/envs/ENV-41/scripts/import-env41-samples.sh --help - Import Script (ENV-42):
/Users/zhaoliang/LocalRepos/envs/ENV-42/scripts/import-with-progress.sh
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
amend-staged
Amend the last commit with staged changes only
test00-check
Check test00 honeynet production server status - pcap files, tcpdump processes, containers, and Google Drive sync. Use when user wants to check test00 data or server health.
commit-staged
Git commit commands for staged-only commits
git-submodule
Add a directory as a git submodule with its own remote repository. Use when user wants to track a directory with separate git history.
dagster-remote
Manage remote Dagster server on dns-analy4 - start, stop, restart, check status, sync code, and view logs. Use when user wants to manage the remote Dagster deployment.
arkime-tag-manager
Manage Arkime session tags (add/remove) using API
Didn't find tool you were looking for?