What Is a UUID? The RFC 9562 Specification
A UUID (Universally Unique Identifier) — also called a GUID (Globally Unique Identifier) in Microsoft contexts — is a 128-bit label defined in RFC 9562 (May 2024), which obsoleted the original RFC 4122 from 2005. The standard text representation is 32 lowercase hexadecimal digits displayed in five groups separated by hyphens:
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
              ↑    ↑
        version    variant
- The M nibble (first hex digit of the third group) encodes the version (1–8)
- The N nibble (first hex digit of the fourth group) encodes the variant. All RFC 9562 UUIDs use variant 10xx in binary, meaning N is 8, 9, a, or b.
This structure means you can determine the UUID version and confirm RFC compliance just by inspecting two characters of the string — no parsing library needed.
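The two-character check described above can be sketched in a few lines of Python. The helper name `inspect_uuid` is hypothetical, and the sketch assumes the canonical 36-character hyphenated form; it is not a full validator.

```python
def inspect_uuid(text: str) -> tuple[int, bool]:
    """Return (version, is_rfc_variant) by looking at two characters only."""
    hyphens_ok = all(text[i] == "-" for i in (8, 13, 18, 23)) if len(text) == 36 else False
    if not hyphens_ok:
        raise ValueError("not in canonical 8-4-4-4-12 form")
    version = int(text[14], 16)         # M: first hex digit of the third group
    variant_nibble = int(text[19], 16)  # N: first hex digit of the fourth group
    is_rfc_variant = (variant_nibble & 0b1100) == 0b1000  # top two bits are 10
    return version, is_rfc_variant


print(inspect_uuid("550e8400-e29b-41d4-a716-446655440000"))  # (4, True)
```

The bit mask accepts exactly the nibbles 8, 9, a, and b, which is the same rule as the one stated for N.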
The 128-bit size means the theoretical maximum UUID space is 2^128 ≈ 3.4 × 10^38 unique values, far more than any practical system could exhaust. The question is never "will we run out?" but "what are the collision odds for the version I'm using?"
All 8 UUID Versions: What Each Is For
RFC 9562 defines eight versions. Most applications only need to know about three: v4, v7, and v5.
| Version | Source of uniqueness | Best used for |
|---|---|---|
| v1 | MAC address + 60-bit timestamp (100 ns resolution) | Legacy systems; avoid in new code — MAC leaks network identity |
| v2 | DCE Security — UID/GID + domain + timestamp | DCE/POSIX security contexts only; essentially unused today |
| v3 | MD5 hash of a namespace UUID + name | Deterministic IDs from names; prefer v5 (MD5 is deprecated) |
| v4 | 122 bits of CSPRNG randomness | General purpose when sort order does not matter |
| v5 | SHA-1 hash of a namespace UUID + name | Deterministic, reproducible IDs from known inputs |
| v6 | Reordered v1 timestamp (sortable) + node/clock | Sortable, MAC-based IDs; migration from v1 |
| v7 | Unix ms timestamp + random bits | Recommended for database primary keys — sortable + random |
| v8 | Custom (application-defined 122-bit payload) | Vendor-specific or experimental UUID schemes |
The practical decision tree for new projects:
- Database primary key or any ID that will appear in a B-tree index → v7
- Deterministic ID generated from a known name or URL → v5
- Random ID where order does not matter and performance is not critical → v4
- Migrating a v1-based system → v6
- Everything else → v4 as the safe default
UUID v4: Structure and Collision Probability
UUID v4 uses 122 bits of cryptographically secure random data (the remaining 6 bits are fixed for version and variant). A v4 UUID looks like:
550e8400-e29b-41d4-a716-446655440000
              ↑    ↑
      version=4    variant=a (10xx)
Collision probability: With 2^122 possible values, generating n UUIDs gives a collision probability of approximately n²/2^123 (the birthday bound, n²/(2 × 2^122)). In concrete terms:
- 1 billion (10^9) UUIDs generated: probability of any collision ≈ 10^18/2^123 ≈ 9.4 × 10^-20, or roughly 1 in 10^19 (10 quintillion)
- You would need to generate approximately 2.7 × 10^18 UUIDs (2.7 quintillion) before the probability of a collision reaches 50%
This is the birthday paradox applied to a 122-bit space. For any real-world system, v4 collision probability is effectively zero — as long as your random number generator is a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator). Never use Math.random() or language-default non-secure random functions to generate UUIDs; use crypto.randomUUID() in browsers/Node.js, uuid.uuid4() in Python, or uuid.New() in Go.
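The birthday-bound arithmetic can be checked directly. The following is a minimal sketch using the standard approximation p ≈ 1 − exp(−n²/2^(bits+1)); the function names are illustrative, not part of any library.

```python
import math

SPACE_BITS = 122  # random bits in a v4 UUID


def collision_probability(n: int, bits: int = SPACE_BITS) -> float:
    """Birthday-bound approximation: p = 1 - exp(-n^2 / 2^(bits+1))."""
    return -math.expm1(-(n * n) / 2 ** (bits + 1))


def count_for_probability(p: float, bits: int = SPACE_BITS) -> float:
    """Number of UUIDs needed before the collision probability reaches p."""
    return math.sqrt(2 * 2 ** bits * math.log(1 / (1 - p)))


print(f"{collision_probability(10**9):.2e}")  # 9.40e-20
print(f"{count_for_probability(0.5):.2e}")    # 2.71e+18
```

`math.expm1` is used instead of `1 - math.exp(...)` because the probability is far below floating-point epsilon and would otherwise round to zero.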
The one real problem with v4: database index fragmentation. Because v4 is random, inserting it into a B-tree primary key index (which MySQL InnoDB and SQL Server use by default) means every insert lands at a random position in the index rather than at the end. This causes page splits, index fragmentation, and poor write performance at scale. At millions of rows this is measurable; at hundreds of millions it becomes a serious problem. This is exactly what v7 solves.
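The insert-position effect can be illustrated with a toy model. This sketch uses a sorted Python list as a stand-in for a B-tree index and counts how often a new key lands at the tail; it says nothing about real database timings, and the key layouts are assumptions chosen for illustration.

```python
import bisect
import os


def tail_insert_fraction(keys: list[bytes]) -> float:
    """Fraction of inserts that land at the end of a sorted list (toy index)."""
    index: list[bytes] = []
    at_tail = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        at_tail += pos == len(index)  # True counts as 1
        index.insert(pos, key)
    return at_tail / len(keys)


N = 2000
# Fully random 16-byte keys, like v4.
random_keys = [os.urandom(16) for _ in range(N)]
# Time-prefixed keys: a 6-byte increasing counter (a stand-in for a
# millisecond clock) followed by 10 random bytes.
ordered_keys = [i.to_bytes(6, "big") + os.urandom(10) for i in range(N)]

print(f"random keys:        {tail_insert_fraction(random_keys):.1%} at the tail")
print(f"time-prefixed keys: {tail_insert_fraction(ordered_keys):.1%} at the tail")
```

With random keys almost every insert falls somewhere in the middle of the existing data, which is the access pattern that forces page splits in a real index.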
UUID v7: Why It Replaced v4 for Database Keys
UUID v7 (new in RFC 9562) encodes a Unix timestamp in milliseconds in the most significant 48 bits, followed by random bits. This makes v7 UUIDs:
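- Time-ordered: lexicographically sortable by creation time at millisecond granularity. Ordering between UUIDs created in the same millisecond is only guaranteed when the generator uses one of the RFC's optional monotonicity methods (for example, a counter in the bits after the timestamp)
- Cluster-friendly: new inserts land at or near the end of the index, which avoids most of the B-tree page splits caused by random keys
- Time-extractable: you can recover the creation timestamp from the UUID itself
- Still highly random: 74 random bits remain (when no counter is used), giving collision resistance comparable to ULIDs, which carry 80 random bits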
A v7 UUID generated at 2025-05-24T10:30:00.000Z looks like:
019701d7-a840-7xxx-8xxx-xxxxxxxxxxxx
└───────────┘
48-bit Unix ms timestamp (0x019701d7a840 = 1748082600000 ms)
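To make the bit layout concrete, here is a minimal Python sketch that assembles a v7-shaped UUID by hand and reads the timestamp back. The helpers `make_v7` and `v7_timestamp` are hypothetical names for illustration only; the sketch has no same-millisecond counter, and production code should use a maintained library instead.

```python
import os
import uuid
from datetime import datetime, timezone


def make_v7(unix_ms: int) -> uuid.UUID:
    """Assemble a v7-shaped value: 48-bit ms timestamp, version, variant, random fill."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits; 74 are kept
    value = (unix_ms & ((1 << 48) - 1)) << 80     # timestamp in the top 48 bits
    value |= 0x7 << 76                            # version nibble = 7
    value |= ((rand >> 62) & 0xFFF) << 64         # 12 random bits after the version
    value |= 0b10 << 62                           # variant bits = 10
    value |= rand & ((1 << 62) - 1)               # 62 random bits after the variant
    return uuid.UUID(int=value)


def v7_timestamp(u: uuid.UUID) -> datetime:
    """Recover the creation time from the top 48 bits."""
    return datetime.fromtimestamp((u.int >> 80) / 1000, tz=timezone.utc)


moment = datetime(2025, 5, 24, 10, 30, tzinfo=timezone.utc)
u = make_v7(int(moment.timestamp() * 1000))
print(u)                # starts with 019701d7-a840-7
print(v7_timestamp(u))  # 2025-05-24 10:30:00+00:00
```

Because the timestamp occupies the most significant bits, comparing two such values as integers, bytes, or lowercase strings orders them by creation millisecond.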
Indicative comparison (PostgreSQL, 10M rows, primary key on the ID column; exact figures vary with hardware, configuration, and workload):
| Key type | Bulk insert time | Index size |
|---|---|---|
| BIGINT identity (auto-increment) | Baseline | Baseline |
| UUID v4 (random) | 3–5× slower | ~30% larger (fragmentation) |
| UUID v7 (time-ordered) | ~1.1× slower | ~5% larger |
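PostgreSQL has shipped gen_random_uuid(), which returns v4, as a built-in since version 13 (before that it came from the pgcrypto extension). Native v7 generation arrived in PostgreSQL 18 as uuidv7(); on earlier versions, v7 needs a third-party extension such as pg_uuidv7 (which provides uuid_generate_v7()) or application-side generation. MySQL 8.0 added UUID_TO_BIN(UUID(), 1), which reorders the timestamp bytes of MySQL's v1 UUIDs for better index locality; v7 is the cleaner modern equivalent.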
UUID v5: Deterministic IDs from Names
UUID v5 generates a deterministic UUID from a namespace UUID and a name string using SHA-1. The same namespace + name always produces the same UUID, on any machine, at any time — no coordination required.
RFC 9562 defines four standard namespace UUIDs:
- 6ba7b810-9dad-11d1-80b4-00c04fd430c8: DNS (for domain names)
- 6ba7b811-9dad-11d1-80b4-00c04fd430c8: URL (for URLs)
- 6ba7b812-9dad-11d1-80b4-00c04fd430c8: OID (for ISO OIDs)
- 6ba7b814-9dad-11d1-80b4-00c04fd430c8: X.500 DN (for directory names)
Use cases for v5:
- Generating a stable ID for a URL or external resource without a round-trip to a database
- Deduplication: two systems independently generating a UUID for the same URL get the same ID
- Idempotent event IDs in event-driven systems: retrying the same operation produces the same event ID
- Sharding: deterministically mapping a known input to a shard without a lookup table
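The determinism comes from a short, fixed recipe: hash the namespace bytes plus the name with SHA-1, keep the first 128 bits, then stamp the version and variant. The sketch below spells that out and checks it against Python's standard-library `uuid.uuid5`. The helper name `v5_by_hand`, the example URL, and the shard count are placeholders.

```python
import hashlib
import uuid


def v5_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a v5 UUID step by step: SHA-1, truncate, stamp version and variant."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])     # keep the first 128 bits of the 160-bit hash
    raw[6] = (raw[6] & 0x0F) | 0x50  # version nibble = 5
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant bits = 10
    return uuid.UUID(bytes=bytes(raw))


url = "https://example.com/articles/42"  # placeholder input
by_hand = v5_by_hand(uuid.NAMESPACE_URL, url)
print(by_hand == uuid.uuid5(uuid.NAMESPACE_URL, url))  # True

# One of the listed use cases: map a known input to a shard without a lookup table.
SHARD_COUNT = 16  # hypothetical cluster size
print(by_hand.int % SHARD_COUNT)
```

Any two systems that agree on the namespace and the exact name bytes will compute the same ID, which is why normalising the input (for example, the URL form) before hashing matters.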
Code Examples: UUID Generation in Python, Node.js, Go, and SQL
Always use platform-provided cryptographically secure UUID implementations — never roll your own.
import uuid
# UUID v4 — random, general purpose
v4 = uuid.uuid4()
print(v4) # e.g. 550e8400-e29b-41d4-a716-446655440000
# UUID v5 — deterministic from namespace + name
NS_URL = uuid.NAMESPACE_URL
page_id = uuid.uuid5(NS_URL, 'https://codelint.dev/utility/uuid-generator')
print(page_id) # always the same for this URL
# UUID v1 — time + MAC (avoid in new code)
v1 = uuid.uuid1()
print(v1)
# Extract timestamp from a v1 UUID (100-ns intervals since 1582-10-15)
import datetime
ts_100ns = v1.time
epoch_diff = 0x01b21dd213814000 # 100-ns intervals from 1582-10-15 to 1970-01-01
unix_ts_ns = (ts_100ns - epoch_diff) * 100
created_at = datetime.datetime.fromtimestamp(unix_ts_ns / 1e9, tz=datetime.timezone.utc)
print(created_at)
# Store as bytes (16 bytes) instead of string (36 bytes) for database columns
uuid_bytes = v4.bytes # 16 bytes
uuid_str = str(v4) # 36 characters
restored = uuid.UUID(bytes=uuid_bytes)
print(restored == v4)  # True

import { randomUUID } from 'node:crypto'; // Node.js 14.17+
// UUID v4 — built into Node.js and all modern browsers
const v4 = randomUUID();
console.log(v4); // crypto-secure CSPRNG, never Math.random()
// Browser equivalent (no import needed)
// const v4 = crypto.randomUUID();
// UUID v5 — deterministic (no stdlib support; use 'uuid' package)
import { v5 as uuidv5, v7 as uuidv7 } from 'uuid';
const NS_URL = '6ba7b811-9dad-11d1-80b4-00c04fd430c8';
const pageId = uuidv5('https://codelint.dev', NS_URL);
console.log(pageId); // same every time
// UUID v7: time-sortable ('uuid' package v10+)
const v7 = uuidv7();
console.log(v7);
// Extract timestamp from a v7 UUID
function extractV7Timestamp(uuid) {
  const hex = uuid.replace(/-/g, '').slice(0, 12); // first 48 bits = 12 hex digits
  return parseInt(hex, 16); // Unix milliseconds
}
console.log(new Date(extractV7Timestamp(v7)));

package main
import (
	"fmt"

	"github.com/google/uuid"
)

func main() {
	// UUID v4: random
	v4 := uuid.New()
	fmt.Println(v4)

	// UUID v5: deterministic
	pageID := uuid.NewSHA1(uuid.NameSpaceURL, []byte("https://codelint.dev"))
	fmt.Println(pageID) // same every time

	// UUID v7: time-sortable (google/uuid v1.6+)
	v7, err := uuid.NewV7()
	if err != nil {
		panic(err)
	}
	fmt.Println(v7)

	// Parse and validate
	parsed, err := uuid.Parse("550e8400-e29b-41d4-a716-446655440000")
	if err != nil {
		fmt.Println("invalid UUID")
	} else {
		fmt.Println("version:", parsed.Version()) // version: VERSION_4
	}
}

-- UUID v4 built-in (PostgreSQL 13+)
SELECT gen_random_uuid();
-- UUID primary key column (native uuid type: stored as 16 bytes, displayed as text)
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    amount NUMERIC(12, 2) NOT NULL
);
-- UUID stored as BINARY(16) for maximum efficiency (MySQL)
-- Saves 20 bytes per row compared to CHAR(36)
CREATE TABLE orders (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)),
    -- UUID_TO_BIN with swap_flag=1 reorders bytes for index locality
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    amount DECIMAL(12, 2) NOT NULL
);
-- Retrieve readable UUID from binary column (MySQL)
SELECT BIN_TO_UUID(id, 1) AS id, created_at, amount FROM orders;

Common UUID Mistakes
- Using v1 in security-sensitive contexts. UUID v1 embeds the MAC address of the generating machine. An attacker who obtains a v1 UUID can determine the MAC address of your server, which can aid network reconnaissance. If you expose v1 UUIDs publicly (as API resource IDs, in URLs), use v4 or v7 instead.
- Generating with Math.random() or non-CSPRNG sources. Math.random() is not a cryptographically secure RNG. A UUID built from it is predictable by an attacker who can observe other UUIDs, because the PRNG state can be reconstructed. Always use crypto.randomUUID() (browser/Node.js), uuid.uuid4() (Python), or uuid.New() (Go).
- Storing UUIDs as CHAR(36) when performance matters. A UUID stored as a 36-character string uses 36 bytes. Stored as BINARY(16), it uses 16 bytes, about 55% smaller. For tables with hundreds of millions of rows and UUID primary keys, this difference affects index size, cache efficiency, and I/O. Use BINARY(16) in MySQL; PostgreSQL has a native uuid type that handles this automatically.
- Using v4 for database primary keys at scale. Random v4 UUIDs cause B-tree index fragmentation because each insert lands at a random position rather than the end. Use v7 (time-ordered) for any UUID that will be a database primary key. The sequential nature of v7 keeps index pages full and insert performance close to integer autoincrement.
- Case sensitivity bugs. UUIDs are case-insensitive by spec, but application code often compares them case-sensitively. Normalise all UUIDs to lowercase at ingestion and before any comparison. Store and display in a single consistent case; lowercase is the RFC convention.
- Confusing nil UUID with missing data. The nil UUID 00000000-0000-0000-0000-000000000000 is a valid UUID defined in RFC 9562. Do not use it as a sentinel for "no value" in your database or API; use SQL NULL or an explicit optional type in your application code. The nil UUID appearing in production data is a sign of an uninitialized ID being stored rather than rejected.
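The last two points, case normalisation and rejecting the nil UUID, can be handled in one place at ingestion. The following is a minimal sketch using Python's standard `uuid` module; `parse_id` is a hypothetical helper name, and rejecting nil outright is one possible policy rather than a requirement of the RFC.

```python
import uuid
from typing import Optional

NIL = uuid.UUID(int=0)


def parse_id(raw: Optional[str]) -> Optional[uuid.UUID]:
    """Parse an incoming ID; None means "no value", the nil UUID is rejected."""
    if raw is None:
        return None
    parsed = uuid.UUID(raw)  # accepts either case; raises ValueError if malformed
    if parsed == NIL:
        raise ValueError("nil UUID is not accepted as an identifier")
    return parsed


a = parse_id("550E8400-E29B-41D4-A716-446655440000")
b = parse_id("550e8400-e29b-41d4-a716-446655440000")
print(a == b)   # True: comparison is on the 128-bit value, not the text
print(str(a))   # 550e8400-e29b-41d4-a716-446655440000
```

Parsing into a UUID object sidesteps string comparison entirely, and `str()` on the result always yields the lowercase canonical form.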