Ghidra

A practical guide to Ghidra for SOC analysts and malware reverse engineers — installation, the decompiler, disassembly analysis, scripting with Python/Java, collaborative analysis, and using Ghidra for malware triage.

What Ghidra Is and Why Analysts Use It

  • Ghidra is a software reverse engineering (RE) framework developed by the NSA’s Research Directorate and released as open source in 2019. It includes a disassembler, a decompiler, a debugger (added in version 10.0, with GDB, LLDB, and WinDbg back ends), and a scripting environment.
  • Reverse engineering is an analysis activity rather than an ATT&CK technique in its own right. It supports the investigation of techniques such as T1204 (User Execution): understanding what a sample does once a user runs it requires reading the actual code, not just observing its behavior.
  • Where IDA Pro is the commercial standard (costing thousands per license), Ghidra is completely free and its decompiler is widely considered comparable to IDA’s Hex-Rays decompiler — the key difference being that Ghidra’s decompiler is included, not a paid add-on.
  • Ghidra’s collaborative server lets a team of analysts work on the same binary through a shared, versioned project: each analyst checks the program out, and annotations, function names, and comments are merged when the changes are checked in.

Installation and Setup

System Requirements

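| Requirement | Minimum | Recommended |
| --- | --- | --- |
| RAM | 4 GB | 16 GB (larger binaries need more) |
| JDK | JDK 21 for Ghidra 11.2 and later (JDK 17 for 11.0 and 11.1) | JDK 21 |
| Disk | 2 GB | 10 GB (script outputs, analysis caches) |
| OS | Windows, macOS, Linux | Linux for best performance |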

Installation

# Linux — download and extract
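# The 20241105 build is the 11.2.1 release; check the releases page for the current file name
wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.2.1_build/ghidra_11.2.1_PUBLIC_20241105.zip
unzip ghidra_11.2.1_PUBLIC_20241105.zip
cd ghidra_11.2.1_PUBLIC/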

# Run Ghidra
./ghidraRun

# macOS
brew install --cask ghidra

# Windows — download the zip and run ghidraRun.bat

First Launch

  1. Create a new project (Non-Shared for single-user, Shared for collaborative)
  2. Import a binary (PE, ELF, Mach-O, or raw binary)
  3. Ghidra runs auto-analysis — this takes 30 seconds to several minutes depending on binary size
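
Before importing, it can help to confirm what kind of file a sample really is, since extensions are often misleading. The sketch below is plain Python 3 run outside Ghidra and is not Ghidra's own loader logic. It checks only the leading magic bytes; the function name `guess_format` is hypothetical.

```python
import struct

# Mach-O magics, read as a big-endian 32-bit value from the first four bytes.
MACHO_MAGICS = {
    0xFEEDFACE, 0xFEEDFACF,  # 32-bit and 64-bit Mach-O
    0xCEFAEDFE, 0xCFFAEDFE,  # the same magics stored byte-swapped
    0xCAFEBABE,              # universal ("fat") binary; Java class files share this value
}

def guess_format(header):
    """Return 'PE', 'ELF', 'Mach-O', or 'raw' based on the first bytes of a file."""
    if header[:2] == b"MZ":
        # "MZ" only proves a DOS header; a real PE also has "PE\0\0" at e_lfanew.
        return "PE"
    if header[:4] == b"\x7fELF":
        return "ELF"
    if len(header) >= 4 and struct.unpack(">I", header[:4])[0] in MACHO_MAGICS:
        return "Mach-O"
    return "raw"

print(guess_format(b"MZ\x90\x00"))           # PE
print(guess_format(b"\x7fELF\x02\x01\x01"))  # ELF
```

Anything reported as "raw" needs a manually chosen language and base address at import time.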

Key Features — Disassembler and Decompiler

Ghidra’s two most important views are the Listing (disassembly) and the Decompiler (pseudocode).

Listing (Disassembly)

The listing view shows the raw assembly instructions. Ghidra annotates addresses, opcodes, and operands, and uses color coding to distinguish code, data, and undefined bytes.

Address    Bytes           Instruction        Comment
00401000   55              PUSH  EBP          ; Save base pointer
00401001   8B EC           MOV   EBP, ESP     ; Set up stack frame
00401003   83 EC 0C        SUB   ESP, 0Ch     ; Allocate local variables
00401006   68 00 40 42 00  PUSH  0x424000     ; Push address of string
0040100B   E8 10 00 00 00  CALL  printf       ; Call printf
00401010   33 C0           XOR   EAX, EAX     ; Return 0
00401012   8B E5           MOV   ESP, EBP     ; Restore stack
00401014   5D              POP   EBP          ; Restore base pointer
00401015   C3              RET                ; Return
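
The Bytes column in the listing above can be decoded by hand, which is a useful check when Ghidra has not resolved an operand. As a minimal sketch in plain Python 3 (outside Ghidra, with hypothetical helper names): a near CALL (`E8`) stores a signed 32-bit displacement relative to the next instruction, and `PUSH imm32` (`68`) stores its operand in little-endian order.

```python
import struct

def call_target(address, raw):
    """Resolve the destination of a 5-byte near CALL (opcode E8 + rel32)."""
    if len(raw) != 5 or raw[0:1] != b"\xE8":
        raise ValueError("expected E8 followed by a 4-byte displacement")
    (rel32,) = struct.unpack("<i", raw[1:5])   # signed, little-endian
    # The displacement is relative to the address of the following instruction.
    return (address + 5 + rel32) & 0xFFFFFFFF

def push_imm32(raw):
    """Decode the operand of a 5-byte PUSH imm32 (opcode 68)."""
    if len(raw) != 5 or raw[0:1] != b"\x68":
        raise ValueError("expected 68 followed by a 4-byte immediate")
    return struct.unpack("<I", raw[1:5])[0]

# Bytes taken from the listing rows at 0040100B and 00401006.
print(hex(call_target(0x0040100B, bytes.fromhex("E810000000"))))  # 0x401020
print(hex(push_imm32(bytes.fromhex("6800404200"))))               # 0x424000
```

For the sample listing, the call therefore lands at 0x401020, which is the address Ghidra labels as `printf`.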

Decompiler

The decompiler converts assembly into a C-like pseudocode. This is Ghidra’s killer feature — reading decompiled code is far faster than reading assembly.

// Decompiled Windows API call pattern
void entry(void) {
  int iVar1;
  HANDLE hProcess;
  LPVOID lpBaseAddress;
  HANDLE hThread;
  
  hProcess = OpenProcess(PROCESS_ALL_ACCESS, 0, 0x1234);
  if (hProcess != (HANDLE)0x0) {
    lpBaseAddress = VirtualAllocEx(hProcess, (LPVOID)0x0, 0x100, 0x3000, 0x40);
    WriteProcessMemory(hProcess, lpBaseAddress, &shellcode, 0x100, (SIZE_T *)0x0);
    hThread = CreateRemoteThread(hProcess, (LPSECURITY_ATTRIBUTES)0x0, 0, 
                                 (LPTHREAD_START_ROUTINE)lpBaseAddress, (LPVOID)0x0, 0, (LPDWORD)0x0);
    if (hThread != (HANDLE)0x0) {
      WaitForSingleObject(hThread, 0xFFFFFFFF);
    }
  }
  return;
}

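This decompiled output reveals the injection sequence: the malware opens another process, allocates memory in it, writes shellcode, and creates a remote thread. This is classic Process Injection (T1055). Because raw shellcode is written rather than a DLL path, the pattern sits closer to T1055.002 than to DLL injection (T1055.001).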
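
The numeric arguments in the decompiled call are as informative as the API names. The sketch below, in plain Python 3 outside Ghidra, decodes the `0x3000` and `0x40` values passed to `VirtualAllocEx` using constants from the Windows SDK headers. The helper names are hypothetical and the tables cover only a few common values.

```python
# Constants as defined in the Windows SDK (winnt.h); deliberately not exhaustive.
ALLOCATION_TYPES = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_PROTECTIONS = {
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}

def decode_allocation_type(value):
    """Split a flAllocationType bitmask into flag names."""
    names = [name for bit, name in sorted(ALLOCATION_TYPES.items()) if value & bit]
    unknown = value & ~sum(ALLOCATION_TYPES)  # bits this table does not cover
    if unknown:
        names.append(hex(unknown))
    return names

def decode_protection(value):
    """Map a flProtect value to its name, or return it in hex if unknown."""
    return PAGE_PROTECTIONS.get(value, hex(value))

print(decode_allocation_type(0x3000))  # ['MEM_COMMIT', 'MEM_RESERVE']
print(decode_protection(0x40))         # PAGE_EXECUTE_READWRITE
```

Memory that is committed, writable, and executable in a remote process is a strong hint that the buffer written next is code. The `0xFFFFFFFF` passed to `WaitForSingleObject` is `INFINITE`.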


Scripting Ghidra with Python (Jython) and Java

Python Scripting (Jython)

Ghidra’s built-in Python support is Jython (Python 2.7 on the JVM), so scripts must avoid Python 3 syntax such as f-strings. Scripts can automate analysis, extract data, or modify the program database.

# ListFunctions.py: list all functions in the current binary
# Jython is Python 2.7, so use str.format() rather than f-strings.

fm = currentProgram.getFunctionManager()
functions = fm.getFunctions(True)  # True = iterate in ascending address order

print("Functions in {}:".format(currentProgram.getName()))
for func in functions:
    print("  {} @ 0x{}".format(func.getName(), func.getEntryPoint()))

# ExportStrings.py: extract all defined strings with their addresses
listing = currentProgram.getListing()
data_iter = listing.getDefinedData(True)  # True = forward iteration

for data in data_iter:
    # hasStringValue() is true when the data item's value is a string
    if data.hasStringValue():
        print("0x{}: {}".format(data.getAddress(), data.getDefaultValueRepresentation()))

# FindMutexAPI.py: find CreateMutex* functions and report how often each is referenced
from ghidra.program.model.symbol import SymbolType

MUTEX_APIS = ["CreateMutexA", "CreateMutexW", "CreateMutexExA", "CreateMutexExW"]
fm = currentProgram.getFunctionManager()
mutex_functions = []

for symbol in currentProgram.getSymbolTable().getAllSymbols(True):
    if symbol.getName() in MUTEX_APIS and symbol.getSymbolType() == SymbolType.FUNCTION:
        func = fm.getFunctionAt(symbol.getAddress())
        if func:
            mutex_functions.append(func.getName())
            # getReferenceCount() counts every reference to the symbol, not only calls
            print("  {} @ {} ({} references)".format(
                func.getName(), symbol.getAddress(), symbol.getReferenceCount()))

print("Mutex-related functions found: {}".format(len(mutex_functions)))

Java Scripting

Java scripts use the same Ghidra API as Python scripts. They are compiled rather than interpreted by Jython, so they generally run faster:

// FindStrings.java: list all defined strings in the binary
import ghidra.app.script.GhidraScript;
import ghidra.program.model.listing.Data;
import ghidra.program.model.listing.DataIterator;
import ghidra.program.model.listing.Listing;

public class FindStrings extends GhidraScript {
    @Override
    public void run() throws Exception {
        Listing listing = currentProgram.getListing();
        DataIterator dataIter = listing.getDefinedData(true); // true = forward iteration

        println("Strings found in " + currentProgram.getName() + ":");
        // Stop early if the analyst cancels the script from the UI
        while (dataIter.hasNext() && !monitor.isCancelled()) {
            Data data = dataIter.next();
            // hasStringValue() is true when the data item's value is a string
            if (data.hasStringValue()) {
                println(data.getAddress() + ": " + data.getDefaultValueRepresentation());
            }
        }
    }
}

Malware Analysis Workflow with Ghidra

Step 1 — Initial Import and Auto-Analysis

  1. Create new project
  2. Import the binary (Ghidra detects PE, ELF, Mach-O)
  3. Run auto-analysis — selects appropriate analyzers
  4. Review auto-analysis results (function discovery, stack analysis, data reference creation)

Step 2 — Identify Key Functions

Look for functions that call Windows APIs commonly abused by malware. Individual calls are often benign; the combinations below are what deserve attention:

| API Call | Suspicious Use | Technique |
| --- | --- | --- |
| VirtualAllocEx + WriteProcessMemory + CreateRemoteThread | Process injection | T1055 (T1055.001 for a DLL path, T1055.002 for raw code) |
| CreateFileA + WriteFile + DeleteFileA | Dropping and deleting self | Persistence / defense evasion |
| CryptEncrypt, CryptDecrypt | Encrypting or decrypting payloads | Defense evasion |
| URLDownloadToFileA | Downloading secondary payload | T1105 |
| RegSetValueExA to Run key | Persistence via registry | T1547.001 |
| WNetAddConnection2A | Lateral movement | T1021 |
| CreateProcessWithLogonW | Running commands as another user | Credential abuse |
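
The table above can be turned into a quick triage check once a sample's import names have been exported, for example with a Ghidra script. The sketch below is plain Python 3 run outside Ghidra. The combination list copies rows from the table, while the function name `flag_imports` and the sample import list are hypothetical.

```python
# Each entry: the set of APIs that must all be imported, and the suspicious use it suggests.
SUSPICIOUS_COMBOS = [
    ({"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"}, "Process injection"),
    ({"CreateFileA", "WriteFile", "DeleteFileA"}, "Dropping and deleting self"),
    ({"URLDownloadToFileA"}, "Downloading secondary payload"),
    ({"WNetAddConnection2A"}, "Lateral movement"),
    ({"CreateProcessWithLogonW"}, "Running commands as another user"),
]

def flag_imports(imports):
    """Return the suspicious uses whose required APIs are all present in imports."""
    present = set(imports)
    return [label for required, label in SUSPICIOUS_COMBOS if required <= present]

sample_imports = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread", "ExitProcess"]
print(flag_imports(sample_imports))  # ['Process injection']
```

A match is a lead, not a verdict: packed samples may import none of these names and resolve them at run time, and legitimate software uses many of them.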

Step 3 — Trace the Execution Flow

Use Ghidra’s Function Call Trees and Cross References (complement with Volatility for memory-level validation):

  1. Find entry() or WinMain
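  2. Open Window → Function Call Trees to see incoming and outgoing calls; right-click a function and use the References submenu to list the places that reference it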
  3. Trace the call tree — which functions call which
  4. Look for encryption/decryption loops (XOR, AES, RC4)
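
When a loop in the decompiler turns out to be a simple XOR decoder, reproducing it outside Ghidra is usually the fastest way to recover the hidden data. The following plain Python 3 sketch is not taken from any particular sample. It implements a repeating-key XOR and a brute force over all single-byte keys; the function names and the example URL are hypothetical.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def brute_force_single_byte(data, marker):
    """Try all 256 one-byte keys; return (key, plaintext) pairs that contain marker."""
    hits = []
    for key in range(256):
        plain = xor_bytes(data, bytes([key]))
        if marker in plain:
            hits.append((key, plain))
    return hits

# Simulate an encoded blob copied out of the data section (hypothetical content).
encoded = xor_bytes(b"http://example.test/gate", b"\x5a")
for key, plain in brute_force_single_byte(encoded, b"http://"):
    print(hex(key), plain)
```

XOR is its own inverse, so the same `xor_bytes` call both encodes and decodes. AES and RC4 need their real algorithms and keys, which is where tracing key material in the decompiler comes in.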

Step 4 — Extract IOCs

| IOC Type | Where to Find in Ghidra | How to Check |
| --- | --- | --- |
| C2 URLs | Data section, string table | Search for http:// in defined strings (cross-reference with YARA signatures) |
| IP addresses | Data section or stack manipulation | Check integer constants pushed before connect() or send() |
| Mutex names | String table, then trace cross references | Strings passed to CreateMutex or OpenMutex |
| Registry keys | String table | Strings passed to registry API calls |
| Encryption keys | Stack variables, data section | Constants pushed before CryptEncrypt or custom XOR loops (decode with CyberChef) |
| File paths | String table | Strings referenced near file I/O API calls |
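
Two of the checks in the table are easy to automate once strings and constants have been exported from Ghidra. The sketch below is plain Python 3 run outside Ghidra, with hypothetical helper names. It pulls URLs out of a list of strings and converts a 32-bit constant into a dotted IPv4 address, assuming the constant is stored little-endian into a `sockaddr_in`, as on x86.

```python
import re
import socket
import struct

# Deliberately loose pattern: stop at whitespace, quotes, and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(strings):
    """Return every http:// or https:// URL found in an iterable of strings."""
    urls = []
    for text in strings:
        urls.extend(URL_PATTERN.findall(text))
    return urls

def ipv4_from_le_dword(value):
    """Interpret a 32-bit constant, as laid out in memory on x86, as an IPv4 address."""
    return socket.inet_ntoa(struct.pack("<I", value & 0xFFFFFFFF))

print(extract_urls(["GET http://example.test/a.php HTTP/1.1", "kernel32.dll"]))
print(ipv4_from_le_dword(0x0100007F))  # 127.0.0.1
```

If the sample calls `htonl()` on the constant first, the byte order is reversed, so it is worth checking both interpretations against the surrounding code.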

Collaborative Analysis — Ghidra Server

Ghidra includes a server for multi-analyst collaboration:

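# Install the Ghidra server as a service (on the server machine, with elevated privileges)
./server/svrInstall

# Or run it in the foreground while testing
./server/ghidraSvr console

# Add a user account (replace <username> with the analyst's login)
./server/svrAdmin -add <username>

| Feature | What It Does |
| --- | --- |
| Shared repository | Analysts check out the same program; function names, comments, and types reach the rest of the team when changes are checked in |
| Check-in/out | Controls conflicts: concurrent changes are merged at check-in, and an exclusive checkout locks a file for one analyst |
| Version history | Full revision history: see who changed what and when |
| User authentication | Username/password, PKI, or LDAP |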

Ghidra vs IDA Pro — When to Use Which

| Feature | Ghidra | IDA Pro |
| --- | --- | --- |
| Price | Free | $1,700+ (Pro); $5,000+ (Enterprise) |
| Decompiler | Included (widely considered comparable to Hex-Rays) | $1,200 add-on (Hex-Rays) |
| GUI | Java-based (can be slow on large databases) | Native, more responsive |
| Scripting | Python (Jython) + Java | Python (IDAPython) + C++ |
| Collaboration | Built-in server | IDA Teams (paid add-on) |
| Debugger | Built-in since 10.0 (GDB, LLDB, WinDbg back ends) | Built-in debugger (WinDbg, GDB, Bochs) |
| Mobile/embedded | Growing support | Mature support |
| Community | Active but smaller | Large, mature community |
| Best for | Budget-conscious teams, collaborative analysis, decompiler-first workflow | Single-analyst deep RE, performance-sensitive analysis, embedded/mobile |
