Data Exfiltration Detection (T1048, T1052)
How to detect data leaving your network -- volume anomalies, off-hours transfers, unusual ports and protocols, DNS tunneling exfiltration, and cloud API exfiltration patterns.
What Data Exfiltration Is and Why It Is the Hardest Phase to Detect
Data exfiltration is the unauthorized transfer of data from within the organization to an external destination controlled by the attacker. MITRE ATT&CK maps exfiltration to T1048 (Exfiltration Over Alternative Protocol) and T1052 (Exfiltration Over Physical Medium).
The detection challenge: Legitimate data also leaves the network. Employees email files to partners, upload documents to cloud storage, push code to GitHub, and stream video. Distinguishing malicious exfiltration from legitimate business use requires understanding normal behavioral baselines — volume, timing, protocol, and destination.
Most ransomware attacks now include data exfiltration before encryption (double extortion), and many intrusions never deploy ransomware — the data itself is the objective (intellectual property theft, espionage, credential harvesting).
Exfiltration Method 1: HTTP/HTTPS POST to External Server
How it works: The attacker sends stolen data in the body of one or more POST requests to a server they control. This is one of the most common exfiltration methods because HTTPS encrypts the payload: without TLS inspection, the analyst sees a TLS connection but cannot read the contents.
Detection logic:
- Large POST request bodies (multiple MB) from a host that doesn’t normally make large uploads
- POST requests to a newly registered domain or an IP address not associated with any known service
- Consistent request size across multiple POSTs (not typical of web traffic)
- POST requests occurring after hours or from a user who is not normally active at that time
Logs to check:
- Proxy logs: Look for POST requests with large `Content-Length` headers, especially to destinations that are not known CDNs or cloud services
- Firewall logs: Large outbound TCP connections to a single external IP
- Sysmon Event ID 3: Identify which process is making the connection — a non-browser process making HTTPS POSTs is a strong indicator
SIEM query (SPL):
index=proxy sourcetype=access_combined
method=POST status=200
| eval size_mb = bytes/1048576
| where size_mb > 5
| stats sum(bytes) as total_bytes by src_ip, dest_host, user
| where total_bytes > 100000000
| sort - total_bytes
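Shows source/destination pairs whose large POSTs (over 5 MB each) total more than 100 MB: candidates for exfiltration. Confirm that `bytes` in your proxy sourcetype counts request (upload) bytes; in many access-log formats it is the response size, and the upload size lives in a separate field such as `bytes_out`.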
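The "consistent request size" signal from the detection logic is not covered by the query above. A minimal sketch of one way to check it, assuming you have already extracted the POST body sizes for a single source/destination pair; the function name and both thresholds are illustrative assumptions, not tuned values:

```python
from statistics import mean, pstdev

def uniform_post_sizes(sizes, min_requests=5, max_cv=0.05):
    """Return True when POST body sizes are suspiciously uniform.

    sizes: request body sizes in bytes for one (source, destination) pair.
    min_requests and max_cv are assumed thresholds for illustration.
    """
    if len(sizes) < min_requests:
        return False  # too few requests to judge
    avg = mean(sizes)
    if avg == 0:
        return False
    # Coefficient of variation: standard deviation relative to the mean.
    return pstdev(sizes) / avg <= max_cv

# Chunked upload: eleven 5 MiB chunks plus a smaller final chunk.
chunked = [5_242_880] * 11 + [5_100_000]
# Ordinary browsing: form posts, small API calls, one file upload.
browsing = [512, 48_000, 1_200, 3_500_000, 90, 20_000]

print(uniform_post_sizes(chunked))
print(uniform_post_sizes(browsing))
```

Uniform sizes alone are weak evidence (legitimate chunked uploaders behave the same way), so treat this as one input alongside destination reputation and process identity.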
Exfiltration Method 2: DNS Tunneling
How it works: The attacker encodes stolen data into DNS queries — typically as subdomains of a domain they control. Since DNS is almost never blocked at firewalls, this method can bypass network controls entirely. The attacker’s DNS server receives the query, decodes the subdomain to extract the data, and sends back a DNS response (possibly containing further commands).
Detection logic:
- High TXT query volume. TXT records carry far more arbitrary data per response than A or AAAA records and are a common carrier for DNS tunneling.
- Unusually long subdomains. A typical subdomain is under 20 characters. DNS tunneling subdomains are often 30-50+ characters of random-looking text (Base64-encoded data).
- High query rate from a single host. A single machine making hundreds of DNS queries per minute to the same domain is suspicious.
- Aberrant record types. Any machine making many TXT, CNAME, or ANY queries is worth investigating — normal clients almost exclusively make A and AAAA queries.
Logs to check:
- DNS query logs: Analyze by `QueryName` length, `QueryType` distribution, and query rate per client
- Sysmon Event ID 22: Shows DNS queries per process. An unexpected process (not a browser or system service) generating heavy DNS traffic is suspicious
SIEM query (SPL):
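index=dns sourcetype=dns
| eval subdomain_len = len(mvindex(split(query, "."), 0))
| where subdomain_len > 30
| eval parent_domain = mvjoin(mvindex(split(query, "."), -2, -1), ".")
| stats count, dc(query) as unique_queries by src_ip, parent_domain, query_type
| where count > 10
| sort - count
Shows hosts sending many queries with a long first label (over 30 characters) to the same parent domain: a DNS tunneling indicator. Results are grouped by parent domain rather than full query name because each tunneled query is unique. Taking the last two labels as the parent domain is a simplification that misgroups suffixes such as `co.uk`.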
Triage decision:
- Is the destination domain registered to your organization? Legitimate CDN/DNS services may use long subdomains
- Does the same host also show large TXT query volume?
- Is the process making these queries a known browser or system service?
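Length alone flags some legitimate CDN hostnames, so the "random-looking text" part of the heuristic is worth checking as well. A minimal sketch that combines label length with Shannon entropy; the function names and both thresholds are assumptions for illustration and would need tuning against your own DNS logs:

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_like_tunnel(query, min_len=30, min_entropy=3.5):
    """Flag a query whose first label is both long and random-looking.

    min_len mirrors the 30-character threshold used in the SPL query;
    min_entropy is an assumed cut-off, not a calibrated value.
    """
    label = query.rstrip(".").split(".")[0]
    return len(label) > min_len and shannon_entropy(label) >= min_entropy

print(looks_like_tunnel("www.example.com"))
print(looks_like_tunnel("dGhpcyBpcyBzZWNyZXQgZGF0YSBmcm9tIGhvc3Q.t.example.net"))
print(looks_like_tunnel("a" * 40 + ".example.org"))
```

The third call shows why entropy helps: a long but repetitive label passes the length test and still scores zero entropy.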
Exfiltration Method 3: Cloud API Uploads
How it works: The attacker uses compromised credentials to upload data to cloud storage services (AWS S3, Azure Blob, Google Drive, SharePoint, Dropbox) via API calls. Since these are legitimate services used for business purposes, the traffic blends in.
Detection logic:
- API calls from an unexpected source. A user who never accesses cloud storage via API suddenly making `PutObject` or `Upload` API calls.
- Volume anomaly. A single user uploading more data in one hour than the entire department does in a day.
- New API keys. Cloud API calls made with an access key that was just created (`CreateAccessKey` followed by a data upload) are a strong indicator.
- After-hours uploads. Bulk uploads starting at 2 AM from a user who never works late.
Logs to check:
- AWS CloudTrail: `PutObject`, `CopyObject`, `GetObject` (mass reads often precede exfiltration). These are S3 data events, which CloudTrail does not record unless data event logging is enabled on the trail.
- Azure Activity Log: `Storage Blob Upload`, `File Upload`
- GCP Cloud Audit Logs: `storage.objects.create`
- SaaS audit logs: Google Workspace audit (Drive downloads), Office 365 audit (SharePoint file access)
SIEM query (SPL) — AWS example:
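index=cloudtrail eventName=PutObject
| stats count, sum(additionalEventData.bytesTransferredIn) as total_bytes by sourceIPAddress, userIdentity.arn, eventSource
| where total_bytes > 50000000
| sort - total_bytes
Shows principals and source IPs whose PutObject uploads total more than 50 MB: potential cloud exfiltration. CloudTrail reports the upload size of S3 data events in `additionalEventData.bytesTransferredIn`; adjust the field name if your ingestion pipeline flattens or renames it.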
Triage decision:
- Is the API call from a production automation account or a user account? Automation expected; user account doing bulk uploads is not.
- Was the access key recently created? Check CloudTrail for `CreateAccessKey` events from the same user in the past 24 hours.
- Is the bucket/container public or private? Upload to a public bucket is especially concerning.
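The "new key, then upload" check can be sketched as a small correlation over already-parsed events. This assumes each event has been flattened into a dict with hypothetical `time`, `event_name`, and `access_key_id` fields; raw CloudTrail records nest these values differently, so the field names here are illustrative only:

```python
from datetime import datetime, timedelta

def uploads_with_fresh_keys(events, window=timedelta(hours=24)):
    """Return PutObject events made with a key created within `window`.

    events: iterable of dicts with 'time' (datetime), 'event_name', and
    'access_key_id'. These are assumed, simplified field names.
    """
    created = {}  # access_key_id -> creation time
    hits = []
    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["event_name"] == "CreateAccessKey":
            created[ev["access_key_id"]] = ev["time"]
        elif ev["event_name"] == "PutObject":
            born = created.get(ev["access_key_id"])
            if born is not None and ev["time"] - born <= window:
                hits.append(ev)
    return hits

t0 = datetime(2024, 1, 1, 2, 0)
sample = [
    {"time": t0, "event_name": "CreateAccessKey", "access_key_id": "KEY-NEW"},
    {"time": t0 + timedelta(minutes=10), "event_name": "PutObject", "access_key_id": "KEY-NEW"},
    {"time": t0 + timedelta(minutes=20), "event_name": "PutObject", "access_key_id": "KEY-OLD"},
]
for hit in uploads_with_fresh_keys(sample):
    print(hit["access_key_id"], hit["time"].isoformat())
```

Only the upload made with `KEY-NEW` is returned; `KEY-OLD` has no creation event in the data, so it is treated as a long-standing key.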
Exfiltration Method 4: Email Attachments
How it works: The attacker sends data to an external email address via attachment. This is a common exfiltration method because email is expected and rarely blocked outright.
Detection logic:
- Large outbound attachments. Emails with attachments > 10MB sent to external recipients.
- New external recipients. A user sending data to an external domain they have never emailed before.
- After-hours email sends. Bulk email activity at unusual times.
- Zip archives. DLP content scanners cannot open password-protected zip files, so treat encrypted attachments sent to external recipients as a signal in their own right.
Logs to check:
- Email gateway logs: Attachment size, recipient domain, sender-user, timestamp
- DLP alerts: Data classification match, sensitive content detection
- Proxy logs: User accessing webmail services (Gmail, Outlook.com) from corporate device — potential unauthorized data transfer
SIEM query (SPL):
index=email sender_domain=yourcompany.com recipient_domain!=yourcompany.com
attachment_size > 10000000
| stats count, sum(attachment_size) as total_bytes by sender, recipient_domain
| sort - total_bytes
Shows all emails with attachments > 10MB sent to external domains.
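The query above covers attachment size but not the "new external recipient" signal. A minimal sketch of that check, assuming a baseline of (sender, recipient domain) pairs has been built from historical gateway logs; the function name, the dict fields, and `yourcompany.com` are placeholders:

```python
def first_time_external(history, messages, own_domain="yourcompany.com", min_bytes=0):
    """Return messages sent to an external domain the sender has never used.

    history: iterable of (sender, recipient_domain) pairs from the baseline period.
    messages: dicts with 'sender', 'recipient_domain', 'attachment_size' (assumed fields).
    """
    known = set(history)
    return [
        m for m in messages
        if m["recipient_domain"] != own_domain
        and m["attachment_size"] >= min_bytes
        and (m["sender"], m["recipient_domain"]) not in known
    ]

baseline = [("alice@yourcompany.com", "partner.example")]
today = [
    {"sender": "alice@yourcompany.com", "recipient_domain": "partner.example", "attachment_size": 12_000_000},
    {"sender": "alice@yourcompany.com", "recipient_domain": "unknown.example", "attachment_size": 15_000_000},
    {"sender": "bob@yourcompany.com", "recipient_domain": "yourcompany.com", "attachment_size": 20_000_000},
]
for msg in first_time_external(baseline, today, min_bytes=10_000_000):
    print(msg["sender"], "->", msg["recipient_domain"])
```

Pairing the sender with the domain, rather than tracking domains globally, keeps a partner that one team emails daily from masking a first contact by someone else.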
Exfiltration Method 5: SSH/SFTP/SCP
How it works: The attacker uses SSH or related protocols (SFTP, SCP) to transfer data to an external server. Since SSH is commonly used by IT teams for legitimate administration, detecting malicious use requires contextual analysis.
Detection logic:
- SSH from a server to the internet. SSH outbound from a server that should never initiate SSH connections to external hosts.
- Data volume via SSH. SFTP transfers of unusually large files.
- Long-lived SSH sessions. SSH sessions from internal hosts to external IPs lasting hours.
- SSH to known-bad IPs. Destination IP associated with threat intelligence feeds.
Logs to check:
- Sysmon Event ID 3: Process making network connections, such as `sshd.exe`, `ssh.exe`, `scp.exe`, or `sftp.exe` connecting to an external IP
- Firewall logs: Outbound SSH (port 22) traffic to external IPs
- Windows Event 5156: Connection permitted by Windows Filtering Platform
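The volume and session-length checks for SSH can be sketched over firewall or flow records. This assumes each record is a dict with hypothetical `src_ip`, `dest_ip`, `dest_port`, `duration_s`, and `bytes_out` fields, and treats only the RFC 1918 ranges as internal; both thresholds are illustrative:

```python
from ipaddress import ip_address, ip_network

# Assumption: the organization uses only RFC 1918 space internally.
INTERNAL_NETS = [ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_external(ip):
    addr = ip_address(ip)
    return not any(addr in net for net in INTERNAL_NETS)

def suspicious_ssh(flows, max_seconds=3600, max_bytes_out=500_000_000):
    """Return (src, dest, reasons) for outbound SSH flows that look unusual."""
    hits = []
    for f in flows:
        if f["dest_port"] != 22 or not is_external(f["dest_ip"]):
            continue
        reasons = []
        if f["duration_s"] > max_seconds:
            reasons.append("long-lived session")
        if f["bytes_out"] > max_bytes_out:
            reasons.append("large outbound volume")
        if reasons:
            hits.append((f["src_ip"], f["dest_ip"], reasons))
    return hits

flows = [
    {"src_ip": "10.1.2.3", "dest_ip": "203.0.113.7", "dest_port": 22, "duration_s": 14_400, "bytes_out": 2_000_000_000},
    {"src_ip": "10.1.2.4", "dest_ip": "10.9.9.9", "dest_port": 22, "duration_s": 20_000, "bytes_out": 900_000_000},
    {"src_ip": "10.1.2.5", "dest_ip": "198.51.100.2", "dest_port": 22, "duration_s": 30, "bytes_out": 4_000},
]
for hit in suspicious_ssh(flows):
    print(hit)
```

SSH on a non-standard port would slip past the `dest_port` filter, so in practice this is paired with the process-level view from Sysmon.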
Exfiltration Method 6: ICMP and Other Protocol Tunneling
How it works: The attacker hides data in protocol fields that are not normally inspected, such as ICMP echo request payloads, HTTP headers, or custom protocols. Tools like pingtunnel implement this over ICMP; iodine does the same over DNS (see Method 2).
Detection logic:
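- Large ICMP packets. Default ping payloads are small (32 bytes on Windows, 56 bytes on Linux and macOS). Tunneling tools often send payloads of 1,000 bytes or more, up to the 1,472-byte maximum that fits in a 1,500-byte MTU, or larger packets that fragment.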
- High ICMP traffic volume. A host sending hundreds of ping packets to the same external IP.
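- Unusual ICMP payload content. Standard ping payloads are fixed, repeating patterns; high-entropy or constantly changing payloads suggest tunneled data. Inspecting them requires packet capture or an IDS.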
Logs to check:
- Firewall logs: ICMP traffic with unusually large packet sizes
- Zeek conn.log: ICMP protocol connections — high count from a single host
- NetFlow/IPFIX: ICMP traffic volume analysis
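The size and volume checks above can be sketched as a single pass over ICMP echo records. This assumes the records have been reduced to `(src, dst, payload_len)` tuples from firewall, Zeek, or flow data; the function name and both thresholds are illustrative assumptions:

```python
from collections import defaultdict

def icmp_suspects(packets, max_normal_payload=64, max_echo_count=100):
    """Return per-(src, dst) stats for pairs with oversized or excessive pings.

    packets: iterable of (src_ip, dst_ip, payload_len) tuples.
    """
    stats = defaultdict(lambda: {"count": 0, "largest": 0})
    for src, dst, payload_len in packets:
        entry = stats[(src, dst)]
        entry["count"] += 1
        entry["largest"] = max(entry["largest"], payload_len)
    return {
        pair: entry for pair, entry in stats.items()
        if entry["largest"] > max_normal_payload or entry["count"] > max_echo_count
    }

packets = (
    [("10.0.0.5", "203.0.113.9", 1400)] * 3      # few packets, oversized payload
    + [("10.0.0.6", "198.51.100.1", 56)] * 5     # ordinary ping
    + [("10.0.0.7", "198.51.100.2", 56)] * 150   # normal size, excessive count
)
for pair, entry in icmp_suspects(packets).items():
    print(pair, entry)
```

Monitoring systems that ping external hosts continuously will trip the count threshold, so an allowlist of known probes is usually needed alongside it.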
Exfiltration Detection — Triaging the Finding
When you detect a potential exfiltration event, follow this decision tree:
Potential exfiltration detected
│
├─ What type of data is at risk?
│ ├─ Regulated (PII, PHI, PCI) → Mandatory breach notification check. Escalate to legal.
│ ├─ Proprietary (source code, trade secrets) → IP theft. Involve executive leadership.
│ └─ Operational (credentials, configs) → Account compromise. Reset affected credentials.
│
├─ What is the volume?
│ ├─ > 1GB → Significant exfiltration. Assume data lost. Begin breach notification process.
│ ├─ 100MB-1GB → Moderate. Deep investigation required. Check historical baseline.
│ └─ < 100MB → Investigate further. May be test exfiltration or false positive.
│
├─ What is the timing?
│ ├─ After hours/weekend → Higher confidence, especially if the user is not normally active at that time.
│ └─ Business hours → Check if the user was actually working. Phish-compromised accounts exfiltrate during normal hours.
│
└─ Containment actions:
  ├─ Block the destination IP/domain at the firewall
  ├─ Isolate the source host
  ├─ Disable the compromised account
  └─ Begin forensic collection on the source system
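The decision tree above can be encoded as a small function so that every alert receives the same first-pass notes. This is a sketch only: the function name and category strings are assumptions, and the thresholds simply mirror the tree (decimal megabytes and gigabytes):

```python
MB = 1_000_000
GB = 1_000_000_000

def triage(data_type, volume_bytes, off_hours):
    """Return first-pass triage notes for a suspected exfiltration event."""
    by_type = {
        "regulated": "Check mandatory breach notification; escalate to legal",
        "proprietary": "Treat as IP theft; involve executive leadership",
        "operational": "Treat as account compromise; reset affected credentials",
    }
    notes = [by_type.get(data_type, "Classify the data before proceeding")]

    if volume_bytes > GB:
        notes.append("Significant: assume data lost; begin breach notification process")
    elif volume_bytes >= 100 * MB:
        notes.append("Moderate: investigate deeply; check historical baseline")
    else:
        notes.append("Low volume: may be test exfiltration or a false positive")

    notes.append(
        "Off-hours: higher confidence" if off_hours
        else "Business hours: confirm the user was actually working"
    )
    return notes

for line in triage("regulated", 2 * GB, off_hours=True):
    print(line)
```

Containment actions are left out deliberately, since blocking, isolation, and account disablement normally need an analyst's decision rather than an automatic rule.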
Prevention — Controls That Stop Exfiltration
| Control | What It Blocks | Effectiveness |
|---|---|---|
| Egress filtering | Blocks outbound traffic to unauthorized ports/protocols. Allow SSH (22) and DNS (53) only to authorized servers. | High: stops direct SSH and direct-to-internet DNS exfiltration; tunneling relayed through the authorized resolver still requires DNS log monitoring |
| DNS sinkhole | Blocks resolution of known-malicious domains. Prevents exfiltration to C2 domains. | Medium — only works against known-bad domains |
| DLP (Data Loss Prevention) | Scans outbound traffic for sensitive data patterns (credit cards, SSN, source code). | Medium — bypassed by encryption, compression, and encoding |
| Cloud CASB | Monitors cloud API calls. Alerts on abnormal cloud access patterns. | High for cloud exfiltration specifically |
| UEBA (User and Entity Behavior Analytics) | Baselines normal behavior and flags anomalies such as unusual data volume, off-hours access, and new destinations. | High: catches novel exfiltration methods |
| Network baselining | Know what normal network traffic looks like so anomalies stand out. | Foundational — required for all other controls |
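
Baselining can start very simply. A minimal sketch, assuming you have daily outbound byte counts per host for a recent window; the function name and the choice of a z-score are illustrative, and real traffic is often skewed enough that median-based statistics behave better:

```python
from statistics import mean, stdev

def volume_zscore(history, today):
    """How many standard deviations today's volume sits above the baseline.

    history: past daily outbound byte counts for one host (at least two values).
    today: today's outbound byte count for the same host.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat baseline: any change at all is infinitely unusual.
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma

# Daily outbound volume in MB for one workstation, then a spike.
baseline_mb = [100, 110, 90, 105, 95]
z = volume_zscore(baseline_mb, 400)
print(round(z, 1))
```

A common starting point is to alert when the score exceeds 3 and then tune the threshold per host role, since file servers and workstations have very different spreads.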
Related
- Kill Chain: covers kill chain concepts
- MITRE ATT&CK for Triage: covers using MITRE ATT&CK during triage
- Insider Threat — detection and response for T1078 techniques
- Cobalt Strike — Detection and Beacon Analysis — detection and response for T1055, T1572, T1071 techniques
- Active Directory Compromise Response — detection and response for T1558 techniques
