Ghidra

What Ghidra Is and Why Analysts Use It

Ghidra is a reverse engineering (RE) framework developed by the NSA’s Research Directorate and released as open source in 2019. It includes a disassembler, a decompiler, a debugger (added in version 10.0, with connectors for GDB, LLDB, and WinDbg), and a scripting environment.
Reverse engineering is not itself a MITRE ATT&CK technique, but it supports the analysis of samples delivered through techniques such as T1204 (User Execution): understanding what a malware sample does requires reading the actual code, not just observing its behavior.
Where IDA Pro is the commercial standard (costing thousands per license), Ghidra is free, and its decompiler is widely considered comparable to IDA’s Hex-Rays decompiler. The key difference is that Ghidra’s decompiler is included rather than sold as a paid add-on.
Ghidra’s collaborative server lets a team of analysts work on the same binary through a shared, versioned repository: each analyst checks the program out, and annotations, function names, and comments are merged when changes are checked back in.

Installation and Setup

System Requirements

Requirement	Minimum	Recommended
RAM	4 GB	16 GB (larger binaries need more)
JDK	JDK 21, 64-bit (Ghidra 11.2; releases 11.0 and 11.1 run on JDK 17)	JDK 21
Disk	2 GB	10 GB (script outputs, analysis caches)
OS	Windows, macOS, Linux	Linux for best performance

Installation

# Linux: download and extract (the 20241105 build is the 11.2.1 release)
wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.2.1_build/ghidra_11.2.1_PUBLIC_20241105.zip
unzip ghidra_11.2.1_PUBLIC_20241105.zip
cd ghidra_11.2.1_PUBLIC/

# Run Ghidra
./ghidraRun

# macOS
brew install --cask ghidra

# Windows — download the zip and run ghidraRun.bat

First Launch

Create a new project (Non-Shared for single-user, Shared for collaborative)
Import a binary (PE, ELF, Mach-O, or raw binary)
Ghidra runs auto-analysis — this takes 30 seconds to several minutes depending on binary size

Key Features — Disassembler and Decompiler

Ghidra’s two most important views are the Listing (disassembly) and the Decompiler (pseudocode).

Listing (Disassembly)

The listing view shows the raw assembly instructions. Ghidra annotates addresses, opcodes, and operands, and uses color coding to distinguish code, data, and undefined bytes.

Address    Bytes           Instruction        Comment
00401000   55              PUSH  EBP          ; Save base pointer
00401001   8B EC           MOV   EBP, ESP     ; Set up stack frame
00401003   83 EC 0C        SUB   ESP, 0Ch     ; Allocate local variables
00401006   68 00 40 42 00  PUSH  0x424000     ; Push address of string
0040100B   E8 10 00 00 00  CALL  printf       ; Call printf
00401010   33 C0           XOR   EAX, EAX     ; Return 0
00401012   8B E5           MOV   ESP, EBP     ; Restore stack
00401014   5D              POP   EBP          ; Restore base pointer
00401015   C3              RET                ; Return
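
The operands in a listing like this can be checked by hand from the raw bytes. The following is a minimal standalone Python 3 sketch (run outside Ghidra, not a Ghidra script). The helper names `push_imm32` and `call_target` are illustrative, and the sketch assumes 32-bit x86 with the `68 imm32` and `E8 rel32` encodings, where the CALL displacement is relative to the address of the next instruction.

```python
import struct

def push_imm32(insn_bytes):
    """Decode the little-endian 32-bit immediate of a `68 imm32` PUSH."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0x68:
        raise ValueError("not a PUSH imm32 encoding")
    return struct.unpack("<I", insn_bytes[1:])[0]

def call_target(insn_addr, insn_bytes):
    """Resolve the destination of an `E8 rel32` near CALL."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a CALL rel32 encoding")
    rel32 = struct.unpack("<i", insn_bytes[1:])[0]  # signed displacement
    next_insn = insn_addr + len(insn_bytes)         # displacement base
    return (next_insn + rel32) & 0xFFFFFFFF

if __name__ == "__main__":
    print(hex(push_imm32(bytes.fromhex("6800404200"))))
    print(hex(call_target(0x0040100B, bytes.fromhex("E810000000"))))
```

For the bytes in the listing, the PUSH operand decodes to 0x424000 and the CALL resolves to 0x401020, which in this example is the address Ghidra labels printf.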

Decompiler

The decompiler converts assembly into a C-like pseudocode. This is Ghidra’s killer feature — reading decompiled code is far faster than reading assembly.

// Decompiled Windows API call pattern
void entry(void) {
  HANDLE hProcess;
  LPVOID lpBaseAddress;
  HANDLE hThread;

  // Open the target process (PID 0x1234) with full access
  hProcess = OpenProcess(PROCESS_ALL_ACCESS, 0, 0x1234);
  if (hProcess != (HANDLE)0x0) {
    // Allocate 0x100 bytes in the target and copy the shellcode into it
    lpBaseAddress = VirtualAllocEx(hProcess, (LPVOID)0x0, 0x100, 0x3000, 0x40);
    WriteProcessMemory(hProcess, lpBaseAddress, &shellcode, 0x100, (SIZE_T *)0x0);
    // Start a thread in the target at the copied shellcode
    hThread = CreateRemoteThread(hProcess, (LPSECURITY_ATTRIBUTES)0x0, 0,
                                 (LPTHREAD_START_ROUTINE)lpBaseAddress, (LPVOID)0x0, 0, (LPDWORD)0x0);
    if (hThread != (HANDLE)0x0) {
      // Wait indefinitely (INFINITE) for the remote thread to finish
      WaitForSingleObject(hThread, 0xFFFFFFFF);
    }
  }
  return;
}

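This decompiled output reveals the full sequence: the malware opens another process, allocates memory in it, writes shellcode, and creates a remote thread. This is classic process injection (ATT&CK T1055). Sub-technique T1055.001 covers DLL injection specifically; shellcode written directly into the target, as here, is closer to T1055.002.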
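
The decompiler often leaves flag arguments as raw constants, such as the 0x3000 and 0x40 passed to VirtualAllocEx above. A small lookup makes them readable. This is a standalone Python 3 sketch (not a Ghidra script); the function names are illustrative, the tables cover only a few common values from the Windows SDK headers, and unknown bits are returned as hex rather than guessed.

```python
# A few allocation-type and page-protection constants from winnt.h.
ALLOCATION_TYPES = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_PROTECTIONS = {
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}

def decode_allocation_type(value):
    """Split a flAllocationType bitmask into the flag names we know."""
    names = [name for bit, name in sorted(ALLOCATION_TYPES.items()) if value & bit]
    unknown = value & ~sum(ALLOCATION_TYPES)  # bits not covered by the table
    if unknown:
        names.append(hex(unknown))
    return " | ".join(names) if names else hex(value)

def decode_protection(value):
    """Map a flProtect value to its name, falling back to hex."""
    return PAGE_PROTECTIONS.get(value, hex(value))

if __name__ == "__main__":
    print(decode_allocation_type(0x3000))
    print(decode_protection(0x40))
```

Here 0x3000 decodes to MEM_COMMIT | MEM_RESERVE and 0x40 to PAGE_EXECUTE_READWRITE: memory that is both writable and executable, which is what injected shellcode needs. Inside Ghidra, the Set Equate action can attach such names to constants in the listing.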

Scripting Ghidra with Python (Jython) and Java

Python Scripting (Jython)

Ghidra uses Jython (Python on the JVM) for scripting. Jython implements Python 2.7, so Python 3 syntax such as f-strings is not available. Scripts can automate analysis, extract data, or modify the program database.

# ListFunctions.py: list all functions in the current binary
# Jython implements Python 2.7, so use str.format() instead of f-strings.
fm = currentProgram.getFunctionManager()
functions = fm.getFunctions(True)  # True = iterate forward by address

print("Functions in {}:".format(currentProgram.getName()))
for func in functions:
    print("  {} @ 0x{}".format(func.getName(), func.getEntryPoint().toString()))

# ExportStrings.py: print all defined strings with their addresses
listing = currentProgram.getListing()
data_iter = listing.getDefinedData(True)  # True = iterate forward by address

for data in data_iter:
    # hasStringValue() is true for data items whose value is a string
    if data.hasStringValue():
        print("0x{}: {}".format(data.getAddress().toString(), data.getDefaultValueRepresentation()))

# FindMutexAPI.py: find call sites of CreateMutex / CreateMutexEx
from ghidra.program.model.symbol import SymbolType

MUTEX_APIS = ["CreateMutexA", "CreateMutexW", "CreateMutexExA", "CreateMutexExW"]
call_sites = []

for symbol in currentProgram.getSymbolTable().getAllSymbols(True):
    if symbol.getName() in MUTEX_APIS and symbol.getSymbolType() == SymbolType.FUNCTION:
        # Keep only references that are calls, not data pointers such as IAT entries
        for ref in symbol.getReferences():
            if ref.getReferenceType().isCall():
                call_sites.append((symbol.getName(), ref.getFromAddress()))

for name, addr in call_sites:
    print("{} called from 0x{}".format(name, addr.toString()))
print("Mutex-related call sites found: {}".format(len(call_sites)))

Java Scripting

Java scripts use the same Ghidra API as Jython scripts but are compiled to bytecode, so they usually run faster on large programs:

// FindStrings.java: list all defined strings in the binary
import ghidra.app.script.GhidraScript;
import ghidra.program.model.listing.Data;
import ghidra.program.model.listing.DataIterator;
import ghidra.program.model.listing.Listing;

public class FindStrings extends GhidraScript {
    @Override
    public void run() throws Exception {
        Listing listing = currentProgram.getListing();
        DataIterator dataIter = listing.getDefinedData(true);

        println("Strings found in " + currentProgram.getName() + ":");
        // Stop early if the user cancels the script from the UI
        while (dataIter.hasNext() && !monitor.isCancelled()) {
            Data data = dataIter.next();
            // hasStringValue() is true for data items whose value is a string
            if (data.hasStringValue()) {
                println(data.getAddress() + ": " + data.getDefaultValueRepresentation());
            }
        }
    }
}

Malware Analysis Workflow with Ghidra

Step 1 — Initial Import and Auto-Analysis

Create new project
Import the binary (Ghidra detects PE, ELF, Mach-O)
Run auto-analysis — selects appropriate analyzers
Review auto-analysis results (function discovery, stack analysis, data reference creation)
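
Format detection at import time starts from the file’s magic bytes. The sketch below is a simplified standalone Python 3 illustration of that idea, not Ghidra’s loader logic, which validates far more of each header. The function name `guess_format` is hypothetical.

```python
# Thin Mach-O magics (32/64-bit, both byte orders) plus the fat/universal magic.
# Note: CA FE BA BE is also used by Java class files, so this is only a hint.
MACHO_MAGICS = {
    bytes.fromhex("feedface"), bytes.fromhex("cefaedfe"),
    bytes.fromhex("feedfacf"), bytes.fromhex("cffaedfe"),
    bytes.fromhex("cafebabe"),
}

def guess_format(header):
    """Guess an executable format from the first bytes of a file."""
    if header[:4] == b"\x7fELF":
        return "ELF"
    if header[:2] == b"MZ":
        # A full check would follow e_lfanew to the "PE\0\0" signature.
        return "PE"
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"
    return "raw binary"

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        print(guess_format(fh.read(4)))
```

A quick check like this before import helps catch samples whose extension does not match their content. Anything reported as raw binary needs a processor and base address chosen manually in the import dialog.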

Step 2 — Identify Key Functions

Look for functions that call Windows APIs commonly abused by malware. None of these APIs is malicious on its own; the combinations are what stand out:

API Call	Suspicious Use	Technique
`VirtualAllocEx` + `WriteProcessMemory` + `CreateRemoteThread`	Process injection	`T1055`
`CreateFileA` + `WriteFile` + `DeleteFileA`	Dropping files and deleting them (or itself) afterwards	Payload staging / defense evasion (`T1070.004` for the deletion)
`CryptEncrypt`, `CryptDecrypt`	Encrypting or decrypting payloads	Defense evasion
`URLDownloadToFileA`	Downloading secondary payload	`T1105`
`RegSetValueExA` to Run key	Persistence via registry	`T1547.001`
`WNetAddConnection2A`	Lateral movement	`T1021`
`CreateProcessWithLogonW`	Running commands as another user	Credential abuse
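
The table can be turned into a simple triage check over a sample’s import list. This is a standalone Python 3 sketch, assuming the import names have already been exported from Ghidra as plain strings. The `SUSPICIOUS_COMBOS` table and `flag_imports` function are illustrative names, and a match is only a lead to follow up in the decompiler, not a verdict.

```python
# Each entry: (set of APIs that must all be imported, label from the table above).
SUSPICIOUS_COMBOS = [
    ({"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"}, "Process injection"),
    ({"CreateFileA", "WriteFile", "DeleteFileA"}, "Dropping and deleting files"),
    ({"URLDownloadToFileA"}, "Downloading secondary payload"),
    ({"RegSetValueExA"}, "Possible registry persistence"),
    ({"WNetAddConnection2A"}, "Possible lateral movement"),
    ({"CreateProcessWithLogonW"}, "Running commands as another user"),
]

def flag_imports(imports):
    """Return the labels whose required APIs are all present in `imports`."""
    present = set(imports)
    return [label for required, label in SUSPICIOUS_COMBOS if required <= present]

if __name__ == "__main__":
    sample = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread", "ExitProcess"]
    for label in flag_imports(sample):
        print(label)
```

Requiring the whole set keeps noise down: `WriteFile` alone appears in almost every program, while the three injection APIs together are much rarer in benign software.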

Step 3 — Trace the Execution Flow

Use Ghidra’s Function Call Trees and Cross References (complement with Volatility for memory-level validation):

Find entry() or WinMain
Right-click → “References → Show References to Function”
Trace the call tree — which functions call which
Look for encryption/decryption loops (XOR, AES, RC4)
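
Once an XOR loop has been spotted in the decompiler, the encoded buffer can be copied out and decoded offline. The following standalone Python 3 sketch shows a repeating-key XOR decoder and a single-byte key search. The function names and the `http` marker are assumptions for illustration, and real samples may use multi-byte or rolling keys that this search will not recover.

```python
def xor_decode(data, key):
    """XOR every byte of `data` with a repeating `key` (both bytes)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def brute_force_single_byte(data, marker=b"http"):
    """Try every non-zero one-byte key; keep results that contain `marker`."""
    hits = []
    for key in range(1, 256):
        plain = xor_decode(data, bytes([key]))
        if marker in plain:
            hits.append((key, plain))
    return hits

if __name__ == "__main__":
    # Round trip with a made-up URL and key to show the idea.
    encoded = xor_decode(b"http://example.test/a", b"\x5a")
    for key, plain in brute_force_single_byte(encoded):
        print(hex(key), plain)
```

XOR is its own inverse, so the same function both encodes and decodes. If the key is visible in the decompiled loop, pass it to `xor_decode` directly and skip the search.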

Step 4 — Extract IOCs

IOC Type	Where to Find in Ghidra	How to Check
C2 URLs	Data section, string table	Search for `http://` and `https://` in defined strings (cross-ref with YARA signatures)
IP addresses	Data section or stack manipulation	Check integer constants pushed before `connect()` or `send()`
Mutex names	String table, then trace cross references	Strings passed to `CreateMutex` or `OpenMutex`
Registry keys	String table	Strings passed to registry API calls
Encryption keys	Stack variables, data section	Constants pushed before `CryptEncrypt` or custom XOR loops (decode with CyberChef)
File paths	String table	Strings referenced near file I/O API calls
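
Parts of this table can be automated once strings have been exported from Ghidra as text. The sketch below is standalone Python 3 with illustrative names (`extract_iocs`, `int_to_ipv4`). It assumes a little-endian x86 sample, where a 32-bit constant stored into `sockaddr_in.sin_addr` lands in memory in network byte order, and its regular expressions are deliberately simple, so results need manual review.

```python
import re
import socket
import struct

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")

def extract_iocs(text):
    """Pull URLs and dotted-quad IPv4 addresses out of exported strings."""
    return {
        "urls": sorted(set(URL_RE.findall(text))),
        "ips": sorted(set(IPV4_RE.findall(text))),
    }

def int_to_ipv4(value):
    """Convert a 32-bit constant, as seen in x86 code, to a dotted quad."""
    return socket.inet_ntoa(struct.pack("<I", value & 0xFFFFFFFF))

if __name__ == "__main__":
    dump = "POST http://10.0.0.5/gate.php fallback 192.168.1.20"
    print(extract_iocs(dump))
    print(int_to_ipv4(0x0100007F))
```

With this byte-order assumption, the constant 0x0100007F corresponds to 127.0.0.1. If the code calls `htonl()` on the value first, the bytes are reversed and the constant should be read big-endian instead.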

Collaborative Analysis — Ghidra Server

Ghidra includes a server for multi-analyst collaboration:

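# On the server machine: set the repositories directory in server/server.conf first
# Add a user account (<username> is a placeholder)
./server/svrAdmin -add <username>
# Run the Ghidra server in the foreground
./server/ghidraSvr console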

Feature	What It Does
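Shared repository	Analysts check out a program, work on a local copy, and check changes back in; others receive function names, comments, and types when they update (changes are not synced live)
Check-in/out	Normal check-outs are merged at check-in, with prompts for conflicting edits; an exclusive check-out locks the whole program file, not individual functions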
Version history	Full revision history — see who changed what and when
User authentication	Username/password, PKI, or LDAP

Ghidra vs IDA Pro — When to Use Which

Feature	Ghidra	IDA Pro
Price	Free (open source, Apache 2.0)	Commercial; typically thousands of dollars per license, varying by edition
Decompiler	Included (widely considered comparable to Hex-Rays)	Hex-Rays (paid add-on)
GUI	Java-based (can be slow on large databases)	Native, more responsive
Scripting	Python (Jython) + Java	Python (IDAPython), IDC, C++ SDK
Collaboration	Built-in server	IDA Teams (paid add-on)
Debugger	Built-in since 10.0 (connects to GDB, LLDB, WinDbg)	Built-in debugger (WinDbg, GDB, Bochs)
Mobile/embedded	Growing support	Mature support
Community	Active but smaller	Large, mature community
Best for	Budget-conscious teams, collaborative analysis, decompiler-first workflow	Single-analyst deep RE, performance-sensitive analysis, embedded/mobile
