T1204, T1059
YARA
How to write YARA rules for identifying malware -- string and hex pattern matching, condition logic, practical hunting rules, and integrating YARA into SOC detection pipelines.
What YARA Is and How Pattern-Matching Malware Detection Works
- YARA (Yet Another Recursive Acronym) is a tool for identifying and classifying malware samples.
- It works by matching rules — sets of strings, hex byte sequences, and boolean conditions — against files, processes, and memory.
- YARA was created by Victor Alvarez of VirusTotal and has become the de facto standard format for expressing malware detection logic in the security industry.
- Many threat intelligence platforms, malware analysis sandboxes, and EDR products can consume YARA rules.
YARA Rule Structure
A YARA rule typically has three sections: meta, strings, and condition. Only the condition section is mandatory.
    rule Example_Rule
    {
        meta:
            description = "Detects a specific malware family"
            author = "SOC Analyst"
            date = "2026-05-23"
            mitre_technique = "T1059.001"
            severity = "high"
        strings:
            $string1 = "malicious_string_here"
            $hex1 = { 90 90 90 90 E8 00 00 00 00 }
            $regex1 = /https?:\/\/evil[0-9]+\.com/
        condition:
            $string1 or $hex1 or $regex1
    }
Rule Components Explained
| Section | Purpose | Example |
|---|---|---|
| rule | Rule name: descriptive, no spaces | rule Win32_Ransomware_LockBit |
| meta | Metadata: not used for matching, nice to have | author, date, description, mitre_technique |
| strings | The patterns to match; several types are available | Text strings, hex bytes, regex |
| condition | Boolean logic for when the rule triggers | $string1 and $string2, #string1 > 3 |
Writing Practical YARA Rules
String-Based Rules — The Most Common
Single string match — simple:
    rule Detect_Mimikatz_String
    {
        strings:
            $mimikatz = "mimikatz"
        condition:
            $mimikatz
    }
Multi-string match — better (reduces FPs):
    rule Detect_Mimikatz_Comprehensive
    {
        strings:
            $s1 = "mimikatz" nocase
            $s2 = "sekurlsa" nocase
            $s3 = "lsadump" nocase
            $s4 = "kerberos" nocase
            $s5 = "::" nocase // mimikatz uses :: as command separator
        condition:
            2 of them
    }
This rule matches any file that contains at least 2 of the specified strings. The nocase modifier makes matching case-insensitive. The 2 of them condition prevents FPs from a single false match (e.g., a legitimate tool with “mimikatz” in a comment). Short, generic strings such as :: and kerberos also occur in benign software, so prefer more specific strings where possible. Combine with Sigma rules for detection logic that triggers YARA scans.
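The counting behind 2 of them can be sketched in Python. This is only an analogue of the rule above, not how YARA is implemented; the helper names are illustrative, and the last call shows how the generic strings can satisfy the threshold on benign input.

```python
# Rough Python analogue of `2 of them` over `nocase` strings.
# PATTERNS mirrors $s1..$s5 from the rule above; helper names are illustrative.
PATTERNS = [b"mimikatz", b"sekurlsa", b"lsadump", b"kerberos", b"::"]


def count_distinct_matches(data: bytes, patterns=PATTERNS) -> int:
    """Return how many patterns occur at least once, ignoring ASCII case."""
    lowered = data.lower()
    return sum(1 for pattern in patterns if pattern.lower() in lowered)


def n_of_them(data: bytes, n: int = 2, patterns=PATTERNS) -> bool:
    """True when at least n distinct patterns are present in data."""
    return count_distinct_matches(data, patterns) >= n


print(n_of_them(b"Invoke-Mimikatz; SEKURLSA::logonpasswords"))  # True
print(n_of_them(b"kerberos ticket renewed"))                    # False
print(n_of_them(b"std::string kerberos_realm"))                 # True (benign input)
```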
Hex Pattern Rules — For Binary Data
    rule Detect_Shellcode_NOP_Sled
    {
        strings:
            $nop_sled = { 90 90 90 90 90 }
        condition:
            $nop_sled
    }
Hex fields explained:
- { 90 90 90 90 90 }: five consecutive NOP (0x90) instructions
- Wildcards: in { 90 ?? E8 ?? 00 00 }, each ?? matches any byte
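To see what the ?? wildcard does, a simple hex string can be translated into a Python bytes regex. This is a minimal sketch with a hypothetical helper name; it handles only full bytes and full-byte wildcards, not nibble wildcards, jumps, or alternatives.

```python
import re


def hex_pattern_to_regex(pattern: str):
    """Translate a simple YARA hex string into a compiled bytes regex.

    Supports only full bytes ("90") and full-byte wildcards ("??").
    """
    tokens = pattern.strip().strip("{}").split()
    parts = []
    for token in tokens:
        if token == "??":
            parts.append(b".")  # any single byte; DOTALL lets "." match 0x0A too
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    return re.compile(b"".join(parts), re.DOTALL)


rx = hex_pattern_to_regex("{ 90 ?? E8 ?? 00 00 }")
print(bool(rx.search(b"\x00\x90\x41\xe8\x0a\x00\x00\xff")))  # True
print(bool(rx.search(b"\x90\x41\xe9\x0a\x00\x00")))          # False
```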
PE Header Rules — For Windows Executables
    import "pe"

    rule Detect_Suspicious_PE_Section
    {
        strings:
            $section_name = ".text" fullword // normal code section name
        condition:
            // a normal PE has fewer than 10 sections
            $section_name and pe.number_of_sections > 10
    }
PE-specific keywords:
| Keyword | What It Checks | Suspicious Value |
|---|---|---|
| pe.number_of_sections | Count of PE sections | > 10 (packed malware may add sections) |
| pe.sections[0].name | Name of first section | Empty, UPX0, .xyz (packed or obfuscated) |
| pe.entry_point | Offset of the entry point | Outside .text section = packer or injected code |
| pe.exports("DllRegisterServer") | Whether the named function is exported | Missing expected exports in a legitimate binary |
| pe.imphash() | Import hash (derived from the imported libraries and functions) | Compare against known import hashes (known-good baselines or known malware) |
| pe.is_dll() | Is it a DLL? | A DLL in a temp directory |
| pe.is_64bit() | Is it a 64-bit build? | Architecture that does not match what the file claims to be |
Example — detect UPX-packed files:
    rule Detect_UPX_Packed
    {
        strings:
            $upx1 = "UPX!" // UPX signature
            $upx2 = "UPX0" // UPX section name
        condition:
            $upx1 or $upx2
    }
Regex Rules — For Pattern Variation
    import "pe"

    rule Detect_PowerShell_EncodedCommand
    {
        strings:
            $re1 = /-EncodedCommand\s+[A-Za-z0-9+\/=]{20,}/
        condition:
            // skip DLL files
            $re1 and not pe.is_dll()
    }
When to use regex: The string has variable content (e.g., a base64 payload of any length) or a structured pattern (URL, IP address, email). Regex is slower than plain string matching — use sparingly.
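After a hit on this kind of regex, decoding the payload is a common triage step. The sketch below assumes the argument is Base64 over UTF-16LE text, which is how PowerShell encodes -EncodedCommand; the function name is illustrative and the pattern is the same one used in the rule.

```python
import base64
import re

# Same pattern as $re1, with a capture group around the Base64 payload.
ENCODED_COMMAND = re.compile(r"-EncodedCommand\s+([A-Za-z0-9+/=]{20,})")


def decode_encoded_commands(text: str) -> list:
    """Return the decoded payload of every -EncodedCommand argument in text."""
    decoded = []
    for match in ENCODED_COMMAND.finditer(text):
        try:
            raw = base64.b64decode(match.group(1), validate=True)
            decoded.append(raw.decode("utf-16-le"))
        except ValueError:
            # Covers invalid Base64 and invalid UTF-16LE; skip rather than guess.
            continue
    return decoded


payload = base64.b64encode("Write-Host hello".encode("utf-16-le")).decode()
print(decode_encoded_commands(f"powershell.exe -EncodedCommand {payload}"))
# ['Write-Host hello']
```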
Advanced Conditions
| Condition | What It Does | Example |
|---|---|---|
| all of them | ALL strings must match | condition: all of them |
| #s1 > 5 | More than 5 occurrences of string $s1 | condition: #s1 > 5 |
| @s1 < 1000 | The first occurrence of $s1 is before offset 1000 | condition: @s1 < 1000 |
| for any i in (1..#s1): (@s1[i] > @s2) | At least one occurrence of $s1 starts after the first occurrence of $s2 | Advanced: enforces an ordering between two patterns |
| $s1 at 0 | String appears at offset 0 | condition: $s1 at 0 (magic bytes) |
| pe.timestamp > 1672531200 | PE compile timestamp after Jan 1, 2023 (UTC) | condition: pe.timestamp > 1672531200 |
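The pe.timestamp threshold is a Unix epoch value in seconds. A small Python helper (the name is illustrative) can derive it for any cutoff date, assuming midnight UTC is the intended boundary:

```python
from datetime import datetime, timezone


def pe_timestamp_threshold(year: int, month: int, day: int) -> int:
    """Unix epoch seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


print(pe_timestamp_threshold(2023, 1, 1))  # 1672531200
```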
Hunting Workflow — Using YARA in the SOC
Step 1 — Define the Hypothesis
Before writing any rule, answer: What am I looking for, and where? Integrate with MISP to auto-generate YARA rules from threat intelligence events.
| Use Case | Hypothesis | Rule Type |
|---|---|---|
| Threat intel IoCs | “We received intel that Sample.A uses unique strings in its binary” | String match (specific to this sample) |
| Malware family hunting | “LockBit ransomware contains the string lockbit in its command-line output” | File-level string match |
| LOLBin abuse | “We want to find all files that decode base64 and execute PowerShell” | Path + string match |
| Process injection | “Find processes with executable memory regions containing MZ headers” | Memory scan |
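For the threat intel use case, a string-based rule can be rendered from a list of indicator strings. The sketch below is a generic template function, not MISP's own export mechanism; the function names and the nocase default are assumptions, and it expects printable single-line indicators.

```python
import re


def escape_yara_string(value: str) -> str:
    """Escape backslashes and double quotes for a YARA text string."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_rule(name: str, indicators: list, min_matches: int = 1) -> str:
    """Render a string-based YARA rule from a list of indicator strings."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError("rule name must be a valid identifier")
    if not indicators:
        raise ValueError("at least one indicator is required")
    if not 1 <= min_matches <= len(indicators):
        raise ValueError("min_matches must be between 1 and len(indicators)")
    lines = [f"rule {name}", "{", "    strings:"]
    for index, indicator in enumerate(indicators, start=1):
        lines.append(f'        $s{index} = "{escape_yara_string(indicator)}" nocase')
    lines += ["    condition:", f"        {min_matches} of them", "}"]
    return "\n".join(lines)


print(build_rule("Intel_Sample_A", ["lockbit", "C:\\Temp\\drop.exe"], min_matches=2))
```

Generated rules still need the validation pass in Step 4 before they are deployed.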
Step 2 — Write the Rule
Use the rule structure above. Test on known samples first (VirusTotal, Joe Sandbox samples).
Step 3 — Scan (use REMnux for a pre-configured YARA environment)
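    # Scan a single file
    yara my_rule.yar suspicious.exe
    # Scan a directory recursively
    yara -r my_rule.yar C:\Users\Public\
    # Scan a running process by PID (the PID is the target; -p sets the thread count)
    yara my_rule.yar 1234
    # Scan all running processes (Linux/macOS shell loop; needs sufficient privileges)
    for pid in $(ps -e -o pid=); do yara my_rule.yar "$pid"; done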
Step 4 — Validate and Tune
| Result | Action |
|---|---|
| Hit on known-bad | Rule works! Automate scanning. |
| Hit on known-good | False positive. Add an exclusion to the condition or anchor the rule with a file-type check such as uint16(0) == 0x5A4D. |
| No hit on known-bad | Rule is too narrow. Check for alternate strings or hex patterns. |
| No hits on any file | Rule may be too narrow or may not match at all (wrong encoding, modifiers, or typos). Check the sample manually. |
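The uint checks mentioned for false positives usually anchor a rule to a file type. The Python sketch below mirrors what a condition like uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550 tests for a PE file; the function name is illustrative.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Mirror `uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550`."""
    if len(data) < 0x40:
        return False
    (mz_magic,) = struct.unpack_from("<H", data, 0)
    if mz_magic != 0x5A4D:  # bytes "MZ" read as a little-endian uint16
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew field
    if pe_offset + 4 > len(data):
        return False
    (pe_signature,) = struct.unpack_from("<I", data, pe_offset)
    return pe_signature == 0x00004550  # bytes "PE\0\0"
```

Adding such a check to a string rule keeps it from firing on text files, logs, or documents that merely mention the same strings.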
Integrating YARA into Detection Pipelines
Option 1 — YARA + EDR
Most modern EDRs support YARA scanning. Schedule scans of high-value hosts (Domain Controllers, file servers, admin workstations).
Option 2 — YARA + SIEM
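    # Crontab entry: scan the monitored directory every hour and send hits to syslog
    # rules.yar is a placeholder name; yara expects a rules file, not a directory
    0 * * * * yara -r -m ~/rules/rules.yar /share/monitored/ | logger -t yara_scan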
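If the SIEM prefers structured events over raw syslog lines, the scan output can be converted first. This sketch assumes yara's plain output layout of one "RULE_NAME PATH" line per hit, as printed without -m, -s, or -g; the function names and JSON field names are illustrative.

```python
import json


def parse_yara_line(line: str):
    """Split one plain yara output line into rule name and target path.

    Returns None for blank lines. Rule names contain no spaces, so the
    first space separates the rule from the path (which may contain spaces).
    """
    line = line.strip()
    if not line:
        return None
    rule, _, path = line.partition(" ")
    return {"rule": rule, "path": path}


def to_json_events(output: str) -> list:
    """Convert multi-line yara output into a list of JSON strings."""
    events = []
    for line in output.splitlines():
        hit = parse_yara_line(line)
        if hit and hit["path"]:
            events.append(json.dumps(hit, sort_keys=True))
    return events


print(to_json_events("Detect_UPX_Packed /share/monitored/a b.exe\n"))
```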
Option 3 — YARA + VirusTotal
Submit all detected files to VirusTotal for cross-reference. If VirusTotal agrees (2+ engines detect), escalate to incident.
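The 2+ engines threshold can be expressed as a small check. This sketch assumes a dictionary shaped like the last_analysis_stats object in a VirusTotal API v3 file report, where the "malicious" key holds the number of detecting engines; the function name and default are illustrative.

```python
def should_escalate(analysis_stats: dict, min_engines: int = 2) -> bool:
    """True when at least min_engines engines flag the file as malicious."""
    return int(analysis_stats.get("malicious", 0)) >= min_engines


print(should_escalate({"malicious": 3, "undetected": 60}))  # True
print(should_escalate({"malicious": 1, "undetected": 62}))  # False
```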
