Agent skill
performing-log-source-onboarding-in-siem
Perform structured log source onboarding into SIEM platforms by configuring collectors, parsers, normalization, and validation for complete security visibility.
Install this agent skill to your Project
npx add-skill https://github.com/autohandai/community-skills/tree/main/performing-log-source-onboarding-in-siem
SKILL.md
Performing Log Source Onboarding in SIEM
Overview
Log source onboarding is the systematic process of integrating new data sources into a SIEM platform to enable security monitoring and detection. Proper onboarding requires planning data sources, configuring collection agents, building parsers, normalizing fields to a common schema, and validating data quality. According to the UK NCSC, onboarding should prioritize log sources that provide the highest security value relative to their ingestion cost.
Prerequisites
- SIEM platform deployed (Splunk, Elastic, Sentinel, QRadar, or similar)
- Network access from source systems to SIEM collectors
- Administrative access on source systems for agent installation
- Common Information Model (CIM) or equivalent schema documentation
- Change management approval for production system modifications
Log Source Priority Framework
Tier 1 - Critical (Onboard First)
| Source | Log Type | Security Value |
|---|---|---|
| Active Directory | Security Event Logs | Authentication, privilege escalation |
| Firewalls | Traffic logs | Network access, C2 detection |
| EDR/AV | Endpoint alerts | Malware, process execution |
| VPN/Remote Access | Connection logs | Unauthorized access |
| DNS Servers | Query logs | C2 beaconing, data exfiltration |
| Email Gateway | Email security logs | Phishing, BEC |
Tier 2 - High Priority
| Source | Log Type | Security Value |
|---|---|---|
| Web Proxy | HTTP/HTTPS logs | Web-based attacks, data exfiltration |
| Cloud platforms (AWS/Azure/GCP) | Audit logs | Cloud security posture |
| Database servers | Audit/query logs | Data access, SQL injection |
| DHCP/IPAM | Address allocation | Asset tracking |
| File servers | Access logs | Data access monitoring |
Tier 3 - Standard
| Source | Log Type | Security Value |
|---|---|---|
| Application servers | App logs | Application-level attacks |
| Print servers | Print logs | Data loss prevention |
| Badge/physical access | Access logs | Physical security correlation |
| Network devices (switches/routers) | Syslog | Network anomalies |
Onboarding Process
Step 1: Discovery and Assessment
1. Identify the log source:
- System type and version
- Log format (syslog, CEF, JSON, Windows Events, etc.)
- Log volume estimate (EPS - events per second)
- Network location and firewall requirements
2. Assess security value:
- What threats can this source help detect?
- Which MITRE ATT&CK techniques does it cover?
- Is there an existing SIEM parser?
3. Estimate ingestion cost:
- Daily volume in GB
- License impact (per-GB or per-EPS pricing)
- Storage retention requirements
Step 2: Configure Log Collection
Syslog-Based Collection (Firewalls, Network Devices)
# rsyslog configuration for receiving syslog
# /etc/rsyslog.d/10-siem-collection.conf
# UDP reception
module(load="imudp")
input(type="imudp" port="514" ruleset="siem_forwarding")
# TCP reception
module(load="imtcp")
input(type="imtcp" port="514" ruleset="siem_forwarding")
# TLS reception
module(load="imtcp" StreamDriver.AuthMode="x509/name"
StreamDriver.Mode="1" StreamDriver.Name="gtls")
input(type="imtcp" port="6514" ruleset="siem_forwarding")
ruleset(name="siem_forwarding") {
# Forward to SIEM
action(type="omfwd" target="siem.company.com" port="9514"
protocol="tcp" queue.type="LinkedList"
queue.filename="siem_fwd" queue.maxdiskspace="1g"
queue.saveonshutdown="on" action.resumeRetryCount="-1")
}
Windows Event Log Collection (Splunk Universal Forwarder)
# inputs.conf on Splunk Universal Forwarder
[WinEventLog://Security]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:Security
evt_resolve_ad_obj = 1
checkpointInterval = 5
[WinEventLog://System]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:System
[WinEventLog://Microsoft-Windows-Sysmon/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
renderXml = true
[WinEventLog://Microsoft-Windows-PowerShell/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-PowerShell/Operational
Cloud Log Collection (AWS CloudTrail)
{
"AWSTemplateFormatVersion": "2010-09-09",
"Resources": {
"CloudTrailToSIEM": {
"Type": "AWS::CloudTrail::Trail",
"Properties": {
"TrailName": "siem-cloudtrail",
"S3BucketName": "company-cloudtrail-logs",
"IsLogging": true,
"IsMultiRegionTrail": true,
"IncludeGlobalServiceEvents": true,
"EnableLogFileValidation": true,
"EventSelectors": [
{
"ReadWriteType": "All",
"IncludeManagementEvents": true,
"DataResources": [
{
"Type": "AWS::S3::Object",
"Values": ["arn:aws:s3"]
}
]
}
]
}
}
}
}
Step 3: Parse and Normalize
Custom Parser Example (Splunk props.conf/transforms.conf)
# props.conf
[custom:firewall:logs]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
MAX_TIMESTAMP_LOOKAHEAD = 30
TRANSFORMS-firewall = firewall_extract_fields
FIELDALIAS-src = src_addr AS src_ip
FIELDALIAS-dst = dst_addr AS dest_ip
EVAL-action = case(fw_action=="allow", "allowed", fw_action=="deny", "blocked", true(), "unknown")
EVAL-vendor_product = "Custom Firewall"
LOOKUP-geo = geo_ip_lookup ip AS dest_ip OUTPUT country, city, latitude, longitude
# transforms.conf
[firewall_extract_fields]
REGEX = ^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)\s+dst=(\S+):(\d+)\s+proto=(\w+)\s+bytes=(\d+)
FORMAT = timestamp::$1 hostname::$2 fw_action::$3 src_addr::$4 src_port::$5 dst_addr::$6 dst_port::$7 protocol::$8 bytes::$9
CIM Field Mapping
| Raw Field | CIM Field | Data Model |
|---|---|---|
| src_addr | src_ip | Network_Traffic |
| dst_addr | dest_ip | Network_Traffic |
| dst_port | dest_port | Network_Traffic |
| fw_action | action | Network_Traffic |
| bytes_sent + bytes_recv | bytes | Network_Traffic |
| user_name | user | Authentication |
| login_result | action | Authentication |
| process_path | process | Endpoint |
Step 4: Validate Data Quality
# Verify events are arriving
index=new_source earliest=-1h
| stats count by sourcetype, host, source
# Check field extraction quality
index=new_source earliest=-1h
| stats count(src_ip) as has_src count(dest_ip) as has_dest count(action) as has_action count by sourcetype
| eval src_coverage=round(has_src/count*100,1)
| eval dest_coverage=round(has_dest/count*100,1)
| eval action_coverage=round(has_action/count*100,1)
# Verify CIM compliance
| datamodel Network_Traffic search
| search sourcetype=new_sourcetype
| stats count by source, sourcetype
# Check for timestamp parsing issues
index=new_source earliest=-1h
| eval time_diff=abs(_time - _indextime)
| stats avg(time_diff) as avg_lag max(time_diff) as max_lag by host
| where avg_lag > 300
Step 5: Enable Detection Coverage
# Verify existing correlation searches work with new source
index=new_source sourcetype=new_sourcetype
| tstats count from datamodel=Authentication by _time span=1h
| timechart span=1h count
# Create source-specific detection rule
[New Source - Authentication Anomaly]
search = index=new_source sourcetype=new_sourcetype action=failure \
| stats count by src_ip, user \
| where count > 10
Onboarding Checklist
- Log source assessed and approved
- Network connectivity verified
- Collection agent/method configured
- Log forwarding confirmed
- Parser/field extraction configured
- CIM compliance validated
- Data model acceleration enabled
- Volume within license budget
- Retention policy configured
- Detection rules enabled/created
- Dashboard updated
- Documentation completed
- SOC team notified
References
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
mapping-mitre-attack-techniques
Maps observed adversary behaviors, security alerts, and detection rules to MITRE ATT&CK techniques and sub-techniques to quantify detection coverage and guide control prioritization. Use when building an ATT&CK-based coverage heatmap, tagging SIEM alerts with technique IDs, aligning security controls to adversary playbooks, or reporting threat exposure to executives. Activates for requests involving ATT&CK Navigator, Sigma rules, MITRE D3FEND, or coverage gap analysis.
hunting-for-spearphishing-indicators
Hunt for spearphishing campaign indicators across email logs, endpoint telemetry, and network data to detect targeted email attacks.
analyzing-malicious-url-with-urlscan
URLScan.io is a free service for scanning and analyzing suspicious URLs. It captures screenshots, DOM content, HTTP transactions, JavaScript behavior, and network connections of web pages in an isolat
implementing-zero-standing-privilege-with-cyberark
Deploy CyberArk Secure Cloud Access to eliminate standing privileges in hybrid and multi-cloud environments using just-in-time access with time, entitlement, and approval controls.
implementing-pam-for-database-access
Deploy privileged access management for database systems including Oracle, SQL Server, PostgreSQL, and MySQL. Covers session proxy configuration, credential vaulting, query auditing, dynamic credentia
detecting-t1003-credential-dumping-with-edr
Detect OS credential dumping techniques targeting LSASS memory, SAM database, NTDS.dit, and cached credentials using EDR telemetry, Sysmon process access monitoring, and Windows security event correlation.
Didn't find tool you were looking for?