What Is a Cryptographic Hash Function?
A cryptographic hash function must satisfy five properties:
- Deterministic: the same input always produces the same output
- Fast to compute: computing the hash of any input takes negligible time
- Pre-image resistant: given a hash H, it is computationally infeasible to find any input M such that hash(M) = H (one-way)
- Second pre-image resistant: given input M1, it is infeasible to find a different M2 such that hash(M1) = hash(M2)
- Collision resistant: it is infeasible to find any two distinct inputs M1, M2 such that hash(M1) = hash(M2)
These properties together make hash functions useful for integrity checking: if even a single byte of a file changes, its hash changes completely (the "avalanche effect"). With SHA-256, flipping a single input bit changes about 50% of the output bits on average.
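The avalanche effect is easy to observe directly. The following is a minimal sketch using Python's standard `hashlib`; the helper name `bit_difference` and the sample input are illustrative choices, not part of any library.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"Hello, world!"
# Flip the lowest bit of the first byte: "H" (0x48) becomes "I" (0x49).
flipped = bytes([original[0] ^ 0x01]) + original[1:]

digest_a = hashlib.sha256(original).digest()
digest_b = hashlib.sha256(flipped).digest()

changed = bit_difference(digest_a, digest_b)
print(f"{changed} of 256 output bits changed ({changed / 256:.0%})")
```

For any single input pair the count will not be exactly 128, but it should land close to it; repeating the experiment over many inputs averages out near 50%.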
Important: hash functions are not encryption. Encryption is reversible with a key; hashing is a one-way transformation. There is no "decrypting" a hash — only brute-force searching for inputs that produce the same hash.
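To make "brute-force searching" concrete, here is a small sketch that recovers a 4-digit PIN from its SHA-256 hash by trying all 10,000 candidates. The function names and the PIN value are hypothetical, chosen only for illustration.

```python
import hashlib
from typing import Optional

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def recover_pin(target_hex: str) -> Optional[str]:
    """Try every 4-digit PIN until one hashes to target_hex."""
    for n in range(10_000):
        candidate = f"{n:04d}"
        if sha256_hex(candidate) == target_hex:
            return candidate
    return None  # the hashed value was not a 4-digit PIN

leaked_hash = sha256_hex("4821")      # stands in for a hash found in a breach
recovered = recover_pin(leaked_hash)  # found by searching, not by decrypting
print(recovered)
```

Nothing is being reversed here: the search works only because the input space is tiny. The same approach against a long random input is computationally infeasible.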
SHA-1 vs SHA-256 vs SHA-384 vs SHA-512: Which to Use
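SHA-1, SHA-256, SHA-384, and SHA-512 all belong to the Secure Hash Algorithm family published by NIST. They differ in output length, internal state size, and security level. MD5 is not part of that family; it appears in the table for comparison.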
| Algorithm | Output | Security level | Status |
|---|---|---|---|
| MD5 | 128 bits / 32 hex chars | Broken | ❌ Do not use for security |
| SHA-1 | 160 bits / 40 hex chars | Broken (2017) | ⚠ Legacy/deprecated only |
| SHA-256 | 256 bits / 64 hex chars | 128-bit security | ✅ Recommended |
| SHA-384 | 384 bits / 96 hex chars | 192-bit security | ✅ Recommended |
| SHA-512 | 512 bits / 128 hex chars | 256-bit security | ✅ Recommended |
SHA-1 was officially broken in 2017 when researchers from Google and CWI Amsterdam produced the first practical SHA-1 collision (the SHAttered attack), creating two distinct PDF files with identical SHA-1 hashes. SHA-1 is still used in legacy Git commit IDs (Git is migrating to SHA-256), some TLS certificate chains, and older S/MIME signatures — but should not be used in any new system where collision resistance matters.
SHA-256 is the right default for the vast majority of use cases: file checksums, HMAC-SHA-256, certificate fingerprints, and blockchain applications. SHA-512 has a larger security margin and is often faster than SHA-256 on 64-bit CPUs that lack dedicated SHA-256 instructions, because it works on 64-bit words and consumes 128 bytes per block instead of 64. SHA-384 is SHA-512 with different initial values and the output truncated to 384 bits; it is used in TLS 1.3 cipher suites such as TLS_AES_256_GCM_SHA384.
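The output lengths in the table above can be confirmed with `hashlib`. This is a short sketch; the `describe` helper is a hypothetical name introduced here.

```python
import hashlib

def describe(name: str) -> dict:
    """Report digest length and internal block size for a hashlib algorithm."""
    h = hashlib.new(name, b"Hello, world!")
    return {
        "bits": h.digest_size * 8,
        "hex_chars": len(h.hexdigest()),
        "block_bytes": h.block_size,
    }

for name in ("sha1", "sha256", "sha384", "sha512"):
    info = describe(name)
    print(f"{name:7s} {info['bits']:4d} bits  "
          f"{info['hex_chars']:4d} hex chars  "
          f"{info['block_bytes']:4d}-byte blocks")
```

The block size column shows why SHA-384 and SHA-512 are grouped together: both use the same 128-byte block structure, while SHA-1 and SHA-256 use 64-byte blocks.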
File Integrity Verification with SHA Checksums
File integrity verification is the most common non-cryptographic use of hash functions. The workflow is simple:
- The file's publisher computes SHA-256(file) and publishes the hash alongside the download link
- You download the file and compute its SHA-256 locally
- You compare your hash to the published hash — if they match, the file is intact; if they differ, the file was corrupted or tampered with
This is how Linux distributions, software vendors, and package managers (npm, pip, Cargo) verify download integrity. For example, Ubuntu publishes a SHA256SUMS file alongside every ISO release.
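The three-step workflow above might look like this in Python. It is a sketch under simple assumptions: the published checksum is already available as a hex string, and the function names and the temporary demo file are illustrative stand-ins for a real download.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Stream the file through SHA-256 so large downloads fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the locally computed hash with the publisher's value."""
    local = sha256_of_file(path)
    return hmac.compare_digest(local, published_hex.strip().lower())

# Demo with a throwaway file standing in for a downloaded artifact.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example download contents")
    demo_path = tmp.name

published = hashlib.sha256(b"example download contents").hexdigest()
intact = matches_published_checksum(demo_path, published)
mismatch = matches_published_checksum(demo_path, "0" * 64)
os.remove(demo_path)

print(intact, mismatch)
```

Normalising the published value with `strip().lower()` tolerates the trailing whitespace and uppercase hex that checksum files sometimes contain.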
Important nuance: a checksum proves integrity (the file was not altered) but not authenticity (that the file came from the legitimate publisher) unless the checksum itself is signed with a GPG private key. If an attacker can replace both the file and its published checksum, the verification still passes. Always verify that the checksum was published through a trusted channel (HTTPS from the official domain, or a GPG signature you can verify).
HMAC: Hash-Based Message Authentication
HMAC (Hash-based Message Authentication Code, defined in RFC 2104) combines a hash function with a secret key to produce a message authentication code. Unlike a plain hash, an HMAC can only be verified by someone who knows the secret key.
HMAC(key, message) = SHA256((key ⊕ opad) || SHA256((key ⊕ ipad) || message))
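The formula can be checked against Python's `hmac` module by building it by hand. This sketch follows RFC 2104 for SHA-256 (64-byte block, `ipad` = 0x36, `opad` = 0x5C repeated); `hmac_sha256_manual` is a name invented here, and production code should call the `hmac` module rather than a hand-rolled version.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes its input in 64-byte blocks

def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    inner_key = bytes(b ^ 0x36 for b in key)  # key XOR ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # key XOR opad

    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()

key, message = b"secret_key", b"message"
manual = hmac_sha256_manual(key, message)
reference = hmac.new(key, message, hashlib.sha256).digest()
print(manual == reference)
```

The key preparation step (hash if too long, then pad to the block size) is implicit in the one-line formula but required for the two results to agree.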
HMAC use cases:
- API request signing — AWS Signature Version 4, Stripe webhook verification, and GitHub webhook validation all use HMAC-SHA-256 to prove a request came from an authorised sender
- JWT signatures — the HS256 algorithm in JSON Web Tokens is HMAC-SHA-256
- Cookie integrity — signed cookies include an HMAC of the cookie value to detect tampering
- Password reset tokens — sign a token containing the user ID and expiry with an HMAC to prevent forgery without a database lookup
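As an illustration of the last use case, a stateless reset token could be built as follows. This is a sketch, not a reference design: the `user_id.expiry.signature` layout and both function names are assumptions made for this example.

```python
import hashlib
import hmac
import time
from typing import Optional

def make_reset_token(user_id: int, expires_at: int, key: bytes) -> str:
    """Sign 'user_id.expires_at' and append the hex HMAC."""
    payload = f"{user_id}.{expires_at}"
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify_reset_token(token: str, key: bytes,
                       now: Optional[int] = None) -> Optional[int]:
    """Return the user ID if the token is authentic and unexpired, else None."""
    try:
        user_part, expiry_part, sig = token.split(".")
        user_id, expires_at = int(user_part), int(expiry_part)
    except ValueError:
        return None  # wrong number of fields or non-numeric values

    payload = f"{user_part}.{expiry_part}".encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Compare as bytes so unexpected characters cannot raise TypeError.
    if not hmac.compare_digest(expected.encode(), sig.encode()):
        return None

    current = int(time.time()) if now is None else now
    return user_id if current < expires_at else None
```

A token like this stays valid until it expires, so it can be replayed within that window; making it single-use requires some server-side state after all.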
HMAC is resistant to length-extension attacks by construction, unlike the naive SHA256(key || message), because the outer hash hides the internal state of the inner hash. Used with a randomly generated secret key of at least 32 bytes, it is secure against all known attacks.
The Critical Mistake: Do NOT Hash Passwords with SHA
Plain SHA hashing (SHA-1, SHA-256, SHA-512) is completely wrong for password storage. This is the single most important point in this guide.
Why SHA is wrong for passwords:
- SHA is fast, and that is the problem. A modern GPU can compute billions of SHA-256 hashes per second. If your database is breached, attackers can run dictionary attacks against every hash, trying billions of candidate passwords per second. At one billion hashes per second, exhausting every 8-character password over a 94-character charset (94^8 ≈ 6.1 × 10^15 candidates) takes about 70 days on a single GPU. A multi-GPU rig shortens that roughly in proportion, and dictionary words or common patterns fall within seconds.
- Plain SHA is deterministic and unsalted. Two users with the same password produce the same hash, so an attacker can spot shared passwords at a glance and reverse common ones with a single lookup in a precomputed table (a rainbow table attack).
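The brute-force arithmetic for the 8-character example can be checked directly. The hash rate below is the round figure of one billion per second used in the text, not a benchmark of any particular GPU, and `days_to_exhaust` is a helper name introduced for this sketch.

```python
CHARSET_SIZE = 94          # printable ASCII characters, excluding space
PASSWORD_LENGTH = 8
SECONDS_PER_DAY = 86_400

def days_to_exhaust(hashes_per_second: float) -> float:
    """Worst-case days to try every candidate at the given hash rate."""
    keyspace = CHARSET_SIZE ** PASSWORD_LENGTH
    return keyspace / hashes_per_second / SECONDS_PER_DAY

keyspace = CHARSET_SIZE ** PASSWORD_LENGTH
single_gpu_days = days_to_exhaust(1e9)

print(f"{keyspace:,} candidates")
print(f"{single_gpu_days:.1f} days worst case at 1e9 hashes/s")
print(f"{days_to_exhaust(1e11):.2f} days worst case at 1e11 hashes/s")
```

On average an attacker finds the password after searching half the keyspace, so the expected time is half the worst case. Real passwords are rarely uniformly random, which is why dictionary attacks finish far sooner than these figures suggest.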
What to use instead:
| Algorithm | Why it is right for passwords | Recommended settings |
|---|---|---|
| bcrypt | Deliberately slow; auto-salted; cost factor adjustable | cost = 12 (≥ 100 ms on target hardware) |
| Argon2id | Winner of PHC; memory-hard (resists GPU/ASIC); auto-salted | 64 MB memory, 3 iterations, parallelism 4 |
| scrypt | Memory-hard; widely supported | N=32768, r=8, p=1 |
Password hashing algorithms are intentionally slow, and Argon2id and scrypt are also memory-hard, making brute-force attacks orders of magnitude more expensive. They also use a unique random salt per password, so two identical passwords produce different hashes. Use a well-maintained password hashing library for your language; do not implement these algorithms yourself.
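For a dependency-free illustration, Python's standard library exposes scrypt through `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later). The sketch below uses the scrypt settings from the table above; the function names, the 16-byte salt, and the 32-byte output length are assumptions made here, and the `maxmem` value is raised on the assumption that these settings need about 32 MiB, which does not fit under OpenSSL's default cap.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# N, r, p come from the table; maxmem and dklen are choices for this sketch.
SCRYPT_PARAMS = dict(n=32768, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = secrets.token_bytes(16)  # unique random salt per password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, digest)

salt_a, digest_a = hash_password("correct horse battery staple")
salt_b, digest_b = hash_password("correct horse battery staple")
print(digest_a != digest_b)  # same password, different salts, different hashes
print(check_password("correct horse battery staple", salt_a, digest_a))
```

In a real system the cost parameters should be stored with each hash so they can be raised later without invalidating existing records; dedicated libraries handle that encoding for you.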
Code Examples: SHA Hashing in Python, Node.js, and CLI
These examples cover the most common production scenarios: text hashing, file hashing, and HMAC.
import hashlib
import hmac

# Hash a string
text = b"Hello, world!"
sha256 = hashlib.sha256(text).hexdigest()
sha512 = hashlib.sha512(text).hexdigest()
print(f"SHA-256: {sha256}")
print(f"SHA-512: {sha512}")

# Hash a file (streaming: handles large files without loading them into memory)
def sha256_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_file("/etc/hosts"))

# HMAC-SHA256 (e.g. webhook signature verification)
def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    key = secret.encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Use compare_digest to prevent timing attacks
    return hmac.compare_digest(expected, signature)

# Password hashing: use bcrypt, NOT hashlib
# pip install bcrypt
import bcrypt
hashed = bcrypt.hashpw(b"user_password", bcrypt.gensalt(rounds=12))
is_valid = bcrypt.checkpw(b"user_password", hashed)

import { createHash, createHmac, timingSafeEqual } from 'node:crypto';
import { createReadStream } from 'node:fs';

// Hash a string
const sha256 = createHash('sha256').update('Hello, world!').digest('hex');
const sha512 = createHash('sha512').update('Hello, world!').digest('hex');
console.log('SHA-256:', sha256);
console.log('SHA-512:', sha512);

// Hash a file (streaming)
function hashFile(path, algorithm = 'sha256') {
  return new Promise((resolve, reject) => {
    const hash = createHash(algorithm);
    createReadStream(path)
      .on('data', chunk => hash.update(chunk))
      .on('end', () => resolve(hash.digest('hex')))
      .on('error', reject);
  });
}

// HMAC-SHA256 (webhook verification)
function verifyWebhookSignature(payload, signature, secret) {
  const expected = createHmac('sha256', secret)
    .update(payload)
    .digest('hex');
  // Constant-time comparison prevents timing attacks
  const a = Buffer.from(expected, 'hex');
  const b = Buffer.from(signature, 'hex');
  return a.length === b.length && timingSafeEqual(a, b);
}

// Browser: SubtleCrypto API
async function sha256Browser(text) {
  const data = new TextEncoder().encode(text);
  const hashBuffer = await crypto.subtle.digest('SHA-256', data);
  return Array.from(new Uint8Array(hashBuffer))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

# SHA-256 hash of a string
echo -n "Hello, world!" | sha256sum # Linux
echo -n "Hello, world!" | shasum -a 256 # macOS
# SHA-512 hash of a string
echo -n "Hello, world!" | sha512sum # Linux
echo -n "Hello, world!" | shasum -a 512 # macOS
# Hash a file
sha256sum /path/to/file.iso
shasum -a 256 /path/to/file.iso # macOS
# Verify a file against a known checksum
echo "abc123...expected_hash  file.iso" | sha256sum --check   # two spaces between hash and filename
# HMAC-SHA256
echo -n "message" | openssl dgst -sha256 -hmac "secret_key"
# SHA-256 hash of a file using openssl (cross-platform)
openssl dgst -sha256 /path/to/file
# Generate a random 32-byte secret key (for HMAC or similar)
openssl rand -hex 32