T1204, T1059

YARA

How to write YARA rules for identifying malware -- string and hex pattern matching, condition logic, practical hunting rules, and integrating YARA into SOC detection pipelines.

What YARA Is and How Pattern-Matching Malware Detection Works

  • YARA (Yet Another Recursive Acronym) is a tool for identifying and classifying malware samples.
  • It works by matching rules — sets of strings, hex byte sequences, and boolean conditions — against files, processes, and memory.
  • YARA was created by Victor Alvarez of VirusTotal and has become the de facto standard format for expressing malware detection logic in the security industry.
  • Many threat intelligence platforms, malware analysis sandboxes, and EDR products support YARA rules or use them internally.

YARA Rule Structure

A YARA rule typically has three sections: meta, strings, and condition. Only condition is mandatory; meta and strings are optional.

rule Example_Rule
{
    meta:
        description = "Detects a specific malware family"
        author = "SOC Analyst"
        date = "2026-05-23"
        mitre_technique = "T1059.001"
        severity = "high"

    strings:
        $string1 = "malicious_string_here"
        $hex1 = { 90 90 90 90 E8 00 00 00 00 }
        $regex1 = /https?:\/\/evil[0-9]+\.com/

    condition:
        $string1 or $hex1 or $regex1
}

Rule Components Explained

| Section | Purpose | Example |
| --- | --- | --- |
| rule | Rule name: descriptive, no spaces | rule Win32_Ransomware_LockBit |
| meta | Metadata: not used for matching, nice to have | author, date, description, mitre_technique |
| strings | The patterns to match, in several types | Text strings, hex bytes, regex |
| condition | Boolean logic for when the rule triggers | $string1 and $string2, #string1 > 3 |

Writing Practical YARA Rules

String-Based Rules — The Most Common

Single string match — simple:

rule Detect_Mimikatz_String
{
    strings:
        $mimikatz = "mimikatz"
    condition:
        $mimikatz
}

Multi-string match — better (reduces FPs):

rule Detect_Mimikatz_Comprehensive
{
    strings:
        $s1 = "mimikatz" nocase
        $s2 = "sekurlsa" nocase
        $s3 = "lsadump" nocase
        $s4 = "kerberos" nocase
        $s5 = "::"   // mimikatz uses :: to separate module and command
    condition:
        2 of them
}

This rule matches any file that contains at least two of the five strings. The nocase modifier makes matching case-insensitive. Requiring 2 of them reduces false positives from a single incidental match (e.g., a legitimate tool with “mimikatz” in a comment). Generic strings such as “kerberos” and “::” also occur in benign software, so test the rule against known-good files before deploying it. Combine with Sigma rules for detection logic that triggers YARA scans.
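One way to tighten the multi-string approach is to split the strings into named groups and require a hit from each group. The following is a minimal sketch, not a vetted production rule: the rule name, the grouping, and the ascii wide modifiers (which also match UTF-16 encoded strings) are illustrative assumptions.

```yara
rule Detect_Mimikatz_Grouped_Sketch
{
    meta:
        description = "Illustrative sketch: tool name plus at least two module names"

    strings:
        // Tool name, matched as ASCII or UTF-16, in any case
        $name1 = "mimikatz" nocase ascii wide

        // Module names that appear in mimikatz commands
        $mod1 = "sekurlsa" nocase ascii wide
        $mod2 = "lsadump" nocase ascii wide
        $mod3 = "kerberos" nocase ascii wide

    condition:
        // Require the tool name and two of the three module names
        any of ($name*) and 2 of ($mod*)
}
```

Grouping with ($name*) and ($mod*) keeps the generic strings from satisfying the condition on their own.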

Hex Pattern Rules — For Binary Data

rule Detect_Shellcode_NOP_Sled
{
    strings:
        $nop_sled = { 90 90 90 90 90 }
    condition:
        $nop_sled
}

Hex fields explained:

  • { 90 90 90 90 90 } — five consecutive NOP (0x90) instructions
  • Wildcards: { 90 ?? E8 ?? 00 00 }, where ?? matches any single byte
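Beyond single-byte wildcards, YARA hex strings also support jumps and alternatives. The sketch below uses a generic x86 function prologue purely to show the syntax. The byte sequences are assumptions chosen for illustration, occur in many benign binaries, and are not indicators for any malware family.

```yara
rule Hex_Pattern_Features_Sketch
{
    strings:
        // ?? wildcard: push ebp; mov ebp, esp; sub esp, <any imm8>; push ebx; push esi; push edi
        $wildcard = { 55 8B EC 83 EC ?? 53 56 57 }

        // Jump: between 2 and 6 arbitrary bytes may sit between the two fixed parts
        $jump = { 55 8B EC [2-6] 53 56 57 }

        // Alternatives: sub esp with an 8-bit (83 EC) or 32-bit (81 EC) immediate
        $alternative = { 55 8B EC ( 83 EC | 81 EC ) }

    condition:
        any of them
}
```

Keeping several fixed bytes around each wildcard or jump gives the scanner a literal sequence to search for.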

PE Header Rules — For Windows Executables

import "pe"

rule Detect_Suspicious_PE_Section
{
    // No strings section: YARA refuses to compile a rule that defines
    // a string the condition never references
    condition:
        // Most PE files have fewer than 10 sections
        pe.number_of_sections > 10
}

PE-specific keywords:

| Keyword | What It Checks | Suspicious Value |
| --- | --- | --- |
| pe.number_of_sections | Count of PE sections | > 10 (packed malware may add sections) |
| pe.sections[0].name | Name of first section | Empty, UPX0, .xyz (packed or obfuscated) |
| pe.entry_point | Offset of the entry point | Outside .text section = packer or injected code |
| pe.exports("DllRegisterServer") | Whether the named function is exported | Missing expected exports in a legitimate binary |
| pe.imphash() | Import hash (computed over the imported libraries and functions) | Differs from the known-good import hash of the binary it claims to be |
| pe.is_dll() | Is it a DLL? | A DLL in a temp directory |
| pe.is_64bit() | Is it a 64-bit build? | Architecture that does not match the software the file claims to be |

These keywords require import "pe" at the top of the rule file.
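Several of the checks in the table can be combined in one condition. The following is a sketch under stated assumptions: the rule name is hypothetical, the threshold of 10 comes from the table, and the entry-point test assumes the first section is the code section, which is common but not guaranteed.

```yara
import "pe"

rule PE_Layout_Anomalies_Sketch
{
    meta:
        description = "Illustrative sketch: PE files with an unusual section layout"

    condition:
        // Only evaluate files that start with the MZ magic
        uint16(0) == 0x5A4D and
        (
            // More sections than the threshold from the table
            pe.number_of_sections > 10 or
            // First section carries a UPX section name
            pe.sections[0].name == "UPX0" or
            // Entry point (a file offset when scanning files) lies outside the first section
            pe.entry_point < pe.sections[0].raw_data_offset or
            pe.entry_point >= pe.sections[0].raw_data_offset + pe.sections[0].raw_data_size
        )
}
```

Any single branch can fire on legitimate software (installers and protected commercial binaries, for example), so treat a hit as a lead for triage rather than a verdict.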

Example — detect UPX-packed files:

rule Detect_UPX_Packed
{
    strings:
        $upx1 = "UPX!"  // UPX signature
        $upx2 = "UPX0"  // UPX section name
    condition:
        $upx1 or $upx2
}
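The string-based rule above matches "UPX0" anywhere in a file, including inside documents that merely mention it. A stricter variant, sketched here as an assumption rather than a tested detection, checks the PE section table itself. The rule name is hypothetical.

```yara
import "pe"

rule Detect_UPX_Section_Names_Sketch
{
    meta:
        description = "Illustrative sketch: PE whose section table contains UPX section names"

    condition:
        // MZ magic first, so the pe module fields are meaningful
        uint16(0) == 0x5A4D and
        // Walk the section table by index and compare each name
        for any i in (0 .. pe.number_of_sections - 1) : (
            pe.sections[i].name == "UPX0" or pe.sections[i].name == "UPX1"
        )
}
```

Packers can rename their sections, so this variant trades coverage for fewer incidental matches.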

Regex Rules — For Pattern Variation

import "pe"

rule Detect_PowerShell_EncodedCommand
{
    strings:
        $re1 = /-EncodedCommand\s+[A-Za-z0-9+\/=]{20,}/
    condition:
        // pe.is_dll() needs the pe module; "not" filters out DLLs
        $re1 and not pe.is_dll()
}

When to use regex: The string has variable content (e.g., a base64 payload of any length) or a structured pattern (URL, IP address, email). Regex is slower than plain string matching — use sparingly.
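To keep a regex affordable, it helps to start it with a fixed literal and to pair it with a plain string. The sketch below is illustrative only: the rule name and the choice of strings are assumptions, and it covers just two spellings of the parameter (-enc and -EncodedCommand), not every abbreviation PowerShell accepts.

```yara
rule PowerShell_Encoded_Anchored_Sketch
{
    strings:
        // Plain literal that must also be present
        $ps = "powershell" nocase ascii wide

        // Fixed prefix "-enc", optional remainder, then a base64-looking run
        $enc = /-enc(odedcommand)?\s+[A-Za-z0-9+\/=]{20,}/ nocase

    condition:
        $ps and $enc
}
```

The fixed prefix gives the scanner a literal to look for before the regex engine is involved, and the extra string narrows matches to content that also names PowerShell.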


Advanced Conditions

| Condition | What It Does | Example |
| --- | --- | --- |
| all of them | ALL strings must match | condition: all of them |
| #s1 > 5 | More than 5 occurrences of string $s1 | condition: #s1 > 5 |
| @s1 < 1000 | The first occurrence of $s1 is before offset 1000 | condition: @s1 < 1000 |
| for any i in (1..#s1): (@s1[i] != @s2[i]) | For at least one index i, the i-th occurrence of $s1 is at a different offset than the i-th occurrence of $s2 | Advanced: compares occurrence offsets, which are indexed with @ rather than $ |
| $s1 at 0 | String appears at offset 0 | condition: $s1 at 0 (magic bytes) |
| pe.timestamp > 1672531200 | PE compile timestamp after Jan 1, 2023 (UTC); requires import "pe" | condition: pe.timestamp > 1672531200 |
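The operators in the table can be chained in a single condition. This is a syntax sketch only: the rule name, the strings, and every threshold are arbitrary assumptions and do not describe a real sample.

```yara
import "pe"

rule Advanced_Conditions_Sketch
{
    strings:
        // Common start of a DOS header; not universal
        $hdr = { 4D 5A 90 00 }
        $s1 = "cmd.exe /c" nocase
        $s2 = "powershell" nocase

    condition:
        $hdr at 0 and                  // magic bytes at offset 0
        all of ($s*) and               // both text strings are present
        #s1 > 5 and                    // more than 5 occurrences of $s1
        @s2 < 1000 and                 // first occurrence of $s2 before offset 1000
        pe.timestamp > 1672531200      // compiled after 2023-01-01 00:00:00 UTC
}
```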

Hunting Workflow — Using YARA in the SOC

Step 1 — Define the Hypothesis

Before writing any rule, answer: What am I looking for, and where? Integrate with MISP to auto-generate YARA rules from threat intelligence events.

| Use Case | Hypothesis | Rule Type |
| --- | --- | --- |
| Threat intel IoCs | “We received intel that Sample.A uses unique strings in its binary” | String match (specific to this sample) |
| Malware family hunting | “LockBit ransomware contains the string lockbit in its command-line output” | File-level string match |
| LOLBin abuse | “We want to find all files that decode base64 and execute PowerShell” | Path + string match |
| Process injection | “Find processes with executable memory regions containing MZ headers” | Memory scan |

Step 2 — Write the Rule

Use the rule structure above. Test on known samples first (VirusTotal, Joe Sandbox samples).
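As an example of turning a hypothesis into a rule, the sketch below encodes the idea of files that decode base64 and execute PowerShell. The rule name, the hypothesis meta field, the chosen strings, and the size limit are all illustrative assumptions and need validation against known samples.

```yara
rule Hunt_Base64_Decode_And_PowerShell_Sketch
{
    meta:
        description = "Hunting sketch: content that decodes base64 and runs PowerShell"
        hypothesis = "Files that decode base64 and execute PowerShell"
        mitre_technique = "T1059.001"

    strings:
        // Ways to decode base64
        $decode1 = "FromBase64String" nocase ascii wide
        $decode2 = "certutil -decode" nocase ascii wide

        // Ways to run the decoded content
        $exec1 = "powershell" nocase ascii wide
        $exec2 = "Invoke-Expression" nocase ascii wide

    condition:
        // Small files only; the limit is an arbitrary example
        filesize < 1MB and
        any of ($decode*) and any of ($exec*)
}
```

Recording the hypothesis in meta keeps the reasoning next to the rule when it is reviewed later.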

Step 3 — Scan (use REMnux for a pre-configured YARA environment)

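# Scan a single file
yara my_rule.yar suspicious.exe

# Scan a directory recursively
yara -r my_rule.yar C:\Users\Public\

# Scan a running process: pass the PID as the target (-p sets the thread count)
yara my_rule.yar 1234

# Scan all running processes: one approach is a Linux shell loop over PIDs
for pid in $(ps -e -o pid=); do yara my_rule.yar "$pid"; done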

Step 4 — Validate and Tune

| Result | Action |
| --- | --- |
| Hit on known-bad | Rule works. Automate scanning. |
| Hit on known-good | False positive. Add an exception in the condition or restrict the file type with a uint16/uint32 header check. |
| No hit on known-bad | Rule is too narrow. Check for alternate strings or hex patterns. |
| No hits on any file | Rule may be too narrow, or the scan may not be reaching the intended files. Check the sample manually. |
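A common way to cut false positives is to add header and size guards to the condition. The following is a minimal sketch that assumes the target is a Windows executable. The rule name and the 10MB limit are hypothetical choices.

```yara
rule Detect_Mimikatz_Tuned_Sketch
{
    strings:
        $s1 = "mimikatz" nocase
        $s2 = "sekurlsa" nocase
        $s3 = "lsadump" nocase

    condition:
        // MZ magic at offset 0
        uint16(0) == 0x5A4D and
        // "PE\0\0" signature at the offset stored in the e_lfanew field (0x3C)
        uint32(uint32(0x3C)) == 0x00004550 and
        // Skip very large files; the limit is an arbitrary example
        filesize < 10MB and
        2 of them
}
```

With these guards, text files and documents that only mention the strings no longer match, at the cost of missing non-PE carriers such as scripts.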

Integrating YARA into Detection Pipelines

Option 1 — YARA + EDR

Most modern EDRs support YARA scanning. Schedule scans of high-value hosts (Domain Controllers, file servers, admin workstations).

Option 2 — YARA + SIEM

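# Cron entry: scan the monitored directory recursively at the top of every hour.
# yara expects a rules file, not a directory; index.yar is a hypothetical file that includes the other rules.
0 * * * * yara -r -m "$HOME/rules/index.yar" /share/monitored/ | logger -t yara_scan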

Option 3 — YARA + VirusTotal

Submit all detected files to VirusTotal for cross-reference. If VirusTotal agrees (2+ engines detect), escalate to incident.
