Agent skill
reverse-engineering-malware-with-ghidra
Reverse engineers malware binaries using NSA's Ghidra disassembler and decompiler to understand internal logic, cryptographic routines, C2 protocols, and evasion techniques at the assembly and pseudo-C level. Activates for requests involving malware reverse engineering, disassembly analysis, decompilation, binary analysis, or understanding malware internals.
Install this agent skill to your Project
npx add-skill https://github.com/autohandai/community-skills/tree/main/reverse-engineering-malware-with-ghidra
Reverse Engineering Malware with Ghidra
When to Use
- Static and dynamic analysis have identified suspicious functionality that requires deeper code-level understanding
- You need to reverse engineer C2 communication protocols, encryption algorithms, or custom obfuscation
- You need to understand the exact exploit mechanism or vulnerability targeted by a malware sample
- You need to extract hardcoded configuration data (C2 addresses, encryption keys, campaign IDs) embedded in compiled code
- You are developing precise YARA rules or detection signatures based on unique code patterns
Do not use for initial triage of unknown samples; perform static analysis with PEStudio and behavioral analysis with Cuckoo first.
Prerequisites
- Ghidra 11.x installed (download from https://ghidra-sre.org/) with a matching JDK: JDK 17 for Ghidra 11.0 and 11.1, JDK 21 for Ghidra 11.2 and later
- Analysis VM isolated from production network (Windows or Linux host)
- Familiarity with x86/x64 assembly language and Windows API conventions
- PDB symbol files for Windows system DLLs to improve decompilation accuracy
- Ghidra scripts repository (ghidra_scripts) for automated analysis tasks
- Secondary reference: IDA Free or Binary Ninja for cross-validation of analysis results
Workflow
Step 1: Create Project and Import Binary
Set up a Ghidra project and import the malware sample:
1. Launch Ghidra: ghidraRun (Linux) or ghidraRun.bat (Windows)
2. File -> New Project -> Non-Shared Project -> Select directory
3. File -> Import File -> Select malware binary
4. Ghidra auto-detects format (PE, ELF, Mach-O) and architecture
5. Accept default import options (or specify base address if known)
6. Double-click imported file to open in CodeBrowser
7. When prompted, run Auto Analysis with default analyzers enabled
Headless analysis for automation:
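# Run Ghidra headless analysis, then a post-analysis script that exports the decompilation
# ExportDecompilation.py is assumed to be a user-supplied script located under -scriptPath
# -deleteProject removes the temporary project once the script has finished
/opt/ghidra/support/analyzeHeadless /tmp/ghidra_project MalwareProject \
  -import suspect.exe \
  -postScript ExportDecompilation.py \
  -scriptPath /opt/ghidra/scripts/ \
  -deleteProject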
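When many samples are queued, it can help to assemble the same command line programmatically. The following is a minimal sketch that only builds the argument list and does not launch Ghidra; the function name `build_headless_command` is hypothetical, and the paths and script name are the ones used in the command above.

```python
import shlex
from pathlib import PurePosixPath


def build_headless_command(ghidra_home, project_dir, project_name, sample,
                           post_script=None, script_path=None,
                           delete_project=True):
    """Return the analyzeHeadless argument list for one sample (nothing is executed)."""
    cmd = [
        str(PurePosixPath(ghidra_home) / "support" / "analyzeHeadless"),
        str(project_dir),
        project_name,
        "-import", str(sample),
    ]
    if post_script:
        cmd += ["-postScript", post_script]
    if script_path:
        cmd += ["-scriptPath", str(script_path)]
    if delete_project:
        # Only sensible for throwaway projects created by -import
        cmd.append("-deleteProject")
    return cmd


cmd = build_headless_command(
    "/opt/ghidra", "/tmp/ghidra_project", "MalwareProject", "suspect.exe",
    post_script="ExportDecompilation.py",
    script_path="/opt/ghidra/scripts/",
)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` inside the isolated analysis VM avoids shell quoting problems with unusual sample file names.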
Step 2: Identify Key Functions and Entry Points
Navigate the binary to locate critical code sections:
Navigation Strategy:
━━━━━━━━━━━━━━━━━━━
1. Start at entry point (OEP) - follow execution from _start/WinMain
2. Check Symbol Tree for imported functions (Window -> Symbol Tree)
3. Search for cross-references to suspicious APIs:
- VirtualAlloc/VirtualAllocEx (memory allocation for injection)
- CreateRemoteThread (remote thread injection)
- CryptEncrypt/CryptDecrypt (encryption operations)
- InternetOpen/HttpSendRequest (C2 communication)
- RegSetValueEx (persistence via registry)
4. Use Search -> For Strings to find embedded URLs, IPs, and paths
5. Check the Functions window sorted by size (large functions often contain core logic)
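The API list in step 3 can be turned into a quick triage table. The sketch below is a standalone Python helper, not a Ghidra script: it assumes the import names have already been exported (for example, copied from the Symbol Tree), and the name `triage_imports` is hypothetical. The capability labels are taken from the list above.

```python
# Capability labels copied from the navigation strategy above
SUSPICIOUS_APIS = {
    "VirtualAlloc": "memory allocation for injection",
    "VirtualAllocEx": "memory allocation for injection",
    "CreateRemoteThread": "remote thread injection",
    "CryptEncrypt": "encryption operations",
    "CryptDecrypt": "encryption operations",
    "InternetOpen": "C2 communication",
    "HttpSendRequest": "C2 communication",
    "RegSetValueEx": "persistence via registry",
}


def triage_imports(import_names):
    """Group imported function names by capability, ignoring ANSI/Unicode suffixes."""
    hits = {}
    for name in import_names:
        base = name
        # Strip a trailing A/W only when the stripped name is a known API
        if name[-1:] in ("A", "W") and name[:-1] in SUSPICIOUS_APIS:
            base = name[:-1]
        capability = SUSPICIOUS_APIS.get(base)
        if capability:
            hits.setdefault(capability, []).append(name)
    return hits


imports = ["CreateFileW", "InternetOpenA", "HttpSendRequestW",
           "VirtualAllocEx", "CreateRemoteThread"]
for capability, names in triage_imports(imports).items():
    print("{}: {}".format(capability, ", ".join(names)))
```

Each hit is only a starting point for cross-reference analysis; benign software imports many of the same functions.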
Ghidra keyboard shortcuts for efficient navigation:
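G - Go to address
Ctrl+E - Open the Decompiler window
Ctrl+Shift+F - Show references to the current location
S - Search memory for byte patterns
L - Rename label/function
; - Add comment
T - Choose data type (Listing)
Ctrl+L - Retype variable or return value (Decompiler)
(Default key bindings; string search has no default shortcut, use Search -> For Strings)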
Step 3: Analyze Decompiled Code
Use Ghidra's decompiler to understand function logic:
// Example: Ghidra decompiler output for a decryption routine
// Analyst renames variables and adds types for clarity
void decrypt_config(BYTE *encrypted_data, int data_len, BYTE *key, int key_len) {
    // XOR decryption with rolling key
    for (int i = 0; i < data_len; i++) {
        encrypted_data[i] = encrypted_data[i] ^ key[i % key_len];
    }
    return;
}
// Analyst actions in Ghidra:
// 1. Right-click parameters -> Retype to correct types (BYTE*, int)
// 2. Right-click variables -> Rename to meaningful names
// 3. Add comments explaining the algorithm
// 4. Set function signature to propagate types to callers
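Once a routine like `decrypt_config` is understood, re-implementing it outside Ghidra makes it possible to decrypt extracted configuration blobs offline. The following is a minimal Python sketch of the same rolling-key XOR; the key and sample data are hypothetical placeholders, not values from a real sample.

```python
def decrypt_config(encrypted_data: bytes, key: bytes) -> bytes:
    """Rolling-key XOR, mirroring the decompiled loop above."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(encrypted_data))


# Hypothetical key and plaintext, used only to demonstrate the round trip
key = b"\x13\x37"
plaintext = b"http://c2.example/gate.php"
blob = decrypt_config(plaintext, key)       # XOR is its own inverse
assert decrypt_config(blob, key) == plaintext
```

Because XOR is symmetric, the same function both encrypts and decrypts, which is a useful sanity check when validating an extracted key.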
Step 4: Trace C2 Communication Logic
Follow the network communication code path:
Analysis Steps for C2 Protocol Reverse Engineering:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Find InternetOpenA/WinHttpOpen call -> trace to wrapper function
2. Follow data flow from encrypted config -> URL construction
3. Identify HTTP method (GET/POST), headers, and body format
4. Locate response parsing logic (JSON parsing, custom binary protocol)
5. Map the C2 command dispatcher (switch/case or jump table)
6. Document the command set (download, execute, exfiltrate, update, uninstall)
Ghidra Script for extracting C2 configuration:
# Ghidra Python script: extract_c2_config.py
# Run via Script Manager in Ghidra
# Lists references to imported InternetOpen* functions as candidate C2 initialization sites
symbol_table = currentProgram.getSymbolTable()
for symbol in symbol_table.getExternalSymbols():
    if "InternetOpen" in symbol.getName():
        # References to an external symbol often originate from the import
        # table entry or a thunk; follow those one more level to reach call sites
        refs = getReferencesTo(symbol.getAddress())
        for ref in refs:
            print("C2 init at: {}".format(ref.getFromAddress()))
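After the configuration blob has been located and decrypted, candidate C2 endpoints can be pulled out with a standalone script. The sketch below runs outside Ghidra on raw bytes; the function name `extract_indicators` and the sample buffer are hypothetical, and the regular expressions are deliberately loose, so results need manual review.

```python
import re

# Printable-ASCII run following an http/https scheme
URL_RE = re.compile(rb"https?://[\x21-\x7e]{4,}")
# Dotted quad not embedded in a longer digit/dot sequence
IPV4_RE = re.compile(rb"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def extract_indicators(blob: bytes):
    """Return candidate URLs and IPv4 addresses found in a decrypted buffer."""
    urls = sorted({m.group().decode("ascii") for m in URL_RE.finditer(blob)})
    ips = set()
    for m in IPV4_RE.finditer(blob):
        candidate = m.group().decode("ascii")
        # Reject matches such as 999.1.1.1 that are not valid addresses
        if all(int(octet) <= 255 for octet in candidate.split(".")):
            ips.add(candidate)
    return {"urls": urls, "ipv4": sorted(ips)}


# Hypothetical decrypted buffer using documentation-reserved addresses
sample = b"\x00\x01http://203.0.113.7/gate.php?id=\x00junk\x00198.51.100.23\x00999.1.1.1"
print(extract_indicators(sample))
```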
Step 5: Analyze Encryption and Obfuscation
Identify and document cryptographic routines:
Common Malware Encryption Patterns:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
XOR Cipher: Loop with XOR operation, often single-byte or rolling key
RC4: Two loops (KSA + PRGA), 256-byte S-box initialization
AES: Look for S-box constants (0x63, 0x7C, 0x77...) or calls to CryptEncrypt
Base64: Lookup table with A-Za-z0-9+/= characters
Custom: Combination of arithmetic operations (ADD, SUB, ROL, ROR with XOR)
Identification Tips:
- Search for constants: AES S-box, CRC32 table, MD5 init values
- Look for loop structures operating on byte arrays
- Check for Windows Crypto API usage (CryptAcquireContext -> CryptCreateHash -> CryptHashData -> CryptDeriveKey -> CryptEncrypt)
- FindCrypt Ghidra plugin automatically identifies crypto constants
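The constant search described in the tips above can be approximated outside Ghidra with a few byte signatures. This is a rough sketch, not a replacement for FindCrypt: the function name `find_crypto_constants` is hypothetical, the signatures cover only the first bytes of each table, and constants that the compiler emits as separate instruction immediates (common for MD5 init values) will not match a contiguous pattern.

```python
import struct

CRYPTO_SIGNATURES = {
    # First 16 bytes of the AES forward S-box
    "AES S-box": bytes.fromhex("637c777bf26b6fc53001672bfed7ab76"),
    # MD5 initial state stored as four little-endian 32-bit words
    "MD5 init values": struct.pack("<4I", 0x67452301, 0xEFCDAB89,
                                   0x98BADCFE, 0x10325476),
    # First four entries of the standard CRC32 lookup table
    "CRC32 table": struct.pack("<4I", 0x00000000, 0x77073096,
                               0xEE0E612C, 0x990951BA),
    "Base64 alphabet": (b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                        b"abcdefghijklmnopqrstuvwxyz0123456789+/"),
}


def find_crypto_constants(data: bytes):
    """Return sorted (offset, name) pairs for every signature occurrence."""
    hits = []
    for name, signature in CRYPTO_SIGNATURES.items():
        offset = data.find(signature)
        while offset != -1:
            hits.append((offset, name))
            offset = data.find(signature, offset + 1)
    return sorted(hits)


# Synthetic buffer: padding, an AES S-box prefix, a gap, then a Base64 alphabet
demo = (b"\x90" * 16 + CRYPTO_SIGNATURES["AES S-box"]
        + b"\x00" * 8 + CRYPTO_SIGNATURES["Base64 alphabet"])
print(find_crypto_constants(demo))
```

Offsets reported here are file offsets; they must be translated to virtual addresses before navigating to them in Ghidra.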
Step 6: Document Findings and Create Detection Signatures
Produce actionable intelligence from reverse engineering:
# Generate YARA rule from unique code patterns found in Ghidra
cat << 'EOF' > malware_family_x.yar
rule MalwareFamilyX_Decryptor {
    meta:
        description = "Detects MalwareX decryption routine"
        author = "analyst"
        date = "2025-09-15"
    strings:
        // XOR decryption loop with hardcoded key:
        // mov al,[esi+ecx]; xor al,[edi+ecx]; mov [esi+ecx],al; inc ecx; cmp ecx,edx; jl loop_start
        $decrypt = { 8A 04 0E 32 04 0F 88 04 0E 41 3B CA 7C F2 }
        // C2 URL pattern after decryption
        $c2_pattern = "/gate.php?id=" ascii
    condition:
        uint16(0) == 0x5A4D and $decrypt and $c2_pattern
}
EOF
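Bytes copied from the Listing view usually contain addresses and displacements that change between builds, so they should be wildcarded before going into a YARA hex string. The helper below is a small sketch of that formatting step; the name `to_yara_hex` and the example instruction bytes (a `call rel32` followed by `test eax,eax; jz`) are illustrative assumptions, not taken from a real sample.

```python
def to_yara_hex(code: bytes, wildcard_offsets=()):
    """Format bytes as a YARA hex string, replacing selected offsets with ??."""
    wild = set(wildcard_offsets)
    tokens = ["??" if i in wild else "{:02X}".format(b)
              for i, b in enumerate(code)]
    return "{ " + " ".join(tokens) + " }"


# call rel32 ; test eax,eax ; jz +5  (illustrative bytes)
code = bytes.fromhex("E8102030408 5C07405".replace(" ", ""))
# Offsets 1-4 hold the call displacement, which varies between builds
print(to_yara_hex(code, wildcard_offsets=range(1, 5)))
```

Keeping opcodes fixed and masking only operands tends to give signatures that survive recompilation while still being specific to the routine.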
Key Concepts
| Term | Definition |
|---|---|
| Disassembly | Converting machine code bytes into human-readable assembly language instructions; Ghidra's Listing view shows disassembled code |
| Decompilation | Lifting assembly code to pseudo-C representation for easier analysis; Ghidra's Decompile window provides this view |
| Cross-Reference (XREF) | Reference showing where a function or data address is called from or used; essential for tracing code execution flow |
| Control Flow Graph (CFG) | Visual representation of all possible execution paths through a function; reveals branching logic and loops |
| Original Entry Point (OEP) | The actual start address of the malware code after unpacking; packers redirect execution through an unpacking stub first |
| Function Signature | The return type, name, and parameter types of a function; applying correct signatures improves decompiler output quality |
| Ghidra Script | Python or Java automation script executed within Ghidra to perform batch analysis, pattern searching, or data extraction |
Tools & Systems
- Ghidra: NSA's open-source software reverse engineering suite with disassembler, decompiler, and scripting support for multiple architectures
- IDA Pro/Free: Industry-standard interactive disassembler; IDA Free includes a cloud-based decompiler for x86/x64 binaries
- Binary Ninja: Commercial reverse engineering platform with modern UI and extensive API for plugin development
- x64dbg: Open-source x64/x32 debugger for Windows used alongside Ghidra for dynamic debugging of malware
- FindCrypt (Ghidra Plugin): Plugin that identifies cryptographic constants and algorithms in binary code
Common Scenarios
Scenario: Reversing Custom C2 Protocol
Context: Behavioral analysis shows encrypted traffic to an external IP on a non-standard port. Network signatures cannot detect variants because the protocol is proprietary. Deep reverse engineering is needed to understand the protocol structure.
Approach:
- Import the unpacked sample into Ghidra and run full auto-analysis
- Locate socket/WinHTTP API calls and trace backwards to the calling function
- Identify the encryption routine called before data is sent (follow data flow from send/HttpSendRequest)
- Reverse the encryption (XOR key extraction, RC4 key derivation, AES key location)
- Map the command structure by analyzing the response parsing function (switch/case on command IDs)
- Document the protocol format (header structure, command bytes, encryption method)
- Create a protocol decoder script for network monitoring tools
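The final step, a protocol decoder, might look like the sketch below. It assumes the scheme used in the example report in the Output Format section: RC4 with a key equal to MD5(bot_id + campaign seed). The helper names, the bot ID, and the way the two strings are concatenated are assumptions for illustration; a real sample needs its own derivation confirmed in the decompiler.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key-scheduling (KSA) followed by keystream generation (PRGA)."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # KSA: 256-byte S-box initialization
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                         # PRGA: XOR data with the keystream
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def derive_key(bot_id: str, seed: str) -> bytes:
    """Assumed derivation: raw MD5 digest of bot_id concatenated with the seed."""
    return hashlib.md5((bot_id + seed).encode("ascii")).digest()


# Hypothetical bot ID; the seed string is the one shown in the example report
key = derive_key("0123456789abcdef0123456789abcdef", "campaign_2025_q3")
captured = rc4(key, b'{"cmd": 2, "arg": "whoami"}')   # stands in for captured traffic
print(rc4(key, captured))                              # RC4 decrypts with the same call
```

The two loops in `rc4` correspond to the KSA and PRGA loops that identify RC4 in disassembly, which makes this a convenient reference when matching decompiler output.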
Pitfalls:
- Not running the full auto-analysis before starting manual analysis (missing function boundaries and type propagation)
- Ignoring indirect calls through function pointers or vtables (use cross-references to data holding function addresses)
- Spending time on library code that Ghidra's Function ID (FID) or IDA's FLIRT signatures should have identified
- Not saving Ghidra project progress frequently (analysis state can be lost on crashes)
Output Format
REVERSE ENGINEERING ANALYSIS REPORT
=====================================
Sample: unpacked_payload.exe
SHA-256: abc123def456...
Architecture: x86 (32-bit PE)
Ghidra Project: MalwareX_Analysis
FUNCTION MAP
0x00401000 main() - Entry point, initializes config
0x00401200 decrypt_config() - XOR decryption with 16-byte key
0x00401400 init_c2() - WinHTTP initialization, URL construction
0x00401800 c2_beacon() - HTTP POST beacon with system info
0x00401C00 cmd_dispatcher() - Switch on 12 command codes
0x00402000 inject_process() - Process hollowing into svchost.exe
0x00402400 persist_registry() - HKCU Run key persistence
0x00402800 exfil_data() - File collection and encrypted upload
C2 PROTOCOL
Method: HTTPS POST to /gate.php
Encryption: RC4 with derived key (MD5 of bot_id + campaign_key)
Bot ID Format: MD5(hostname + username + volume_serial)
Beacon Interval: 60 seconds with 10% jitter
Command Set:
0x01 - Download and execute file
0x02 - Execute shell command
0x03 - Upload file to C2
0x04 - Update configuration
0x05 - Uninstall and remove traces
ENCRYPTION DETAILS
Algorithm: RC4
Key Derivation: MD5(bot_id + "campaign_2025_q3")
Hardcoded Seed: "campaign_2025_q3" at address 0x00405A00
EXTRACTED IOCs
C2 URLs: hxxps://update.malicious[.]com/gate.php
hxxps://backup.evil[.]net/gate.php (failover)
Campaign ID: campaign_2025_q3
RC4 Key Material: [see encryption details above]
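The IOCs in the report are written in defanged form (`hxxps`, `[.]`) so they cannot be clicked or resolved by accident. A small sketch of that convention follows; the function names `defang` and `refang` are hypothetical, and only the scheme and the last dot of the host are rewritten, matching the style of the example report.

```python
import re

_URL_RE = re.compile(r"^(https?)://([^/:?#]+)(.*)$", re.IGNORECASE)


def defang(url: str) -> str:
    """Rewrite http(s) as hxxp(s) and the dot before the TLD as [.]."""
    match = _URL_RE.match(url)
    if not match:
        return url                      # leave anything unrecognized untouched
    scheme, host, rest = match.groups()
    head, sep, tld = host.rpartition(".")
    safe_host = head + "[.]" + tld if sep else host
    return "hxxp" + scheme[4:] + "://" + safe_host + rest


def refang(text: str) -> str:
    """Reverse the convention for use in tooling inside the analysis VM."""
    return text.replace("[.]", ".").replace("hxxp", "http")


print(defang("https://c2.example.com/gate.php"))
```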