
UUID Complete Guide: Versions, Structure, Database Keys, and When to Use Each

UUIDs (Universally Unique Identifiers) are 128-bit identifiers used everywhere — database primary keys, API resource IDs, distributed system correlation IDs, file names, session tokens. They look simple: 32 hex digits with four dashes. But behind that familiar format are eight distinct versions with radically different properties, and choosing the wrong one can create security vulnerabilities, wreck database index performance, or generate IDs that collide in ways you would never expect. This guide covers the RFC 9562 specification, all eight UUID versions, why UUID v7 has largely replaced v4 for database keys, and production-ready code in four languages.

What Is a UUID? The RFC 9562 Specification

A UUID (Universally Unique Identifier) — also called a GUID (Globally Unique Identifier) in Microsoft contexts — is a 128-bit label defined in RFC 9562 (May 2024), which obsoleted the original RFC 4122 from 2005. The standard text representation is 32 lowercase hexadecimal digits displayed in five groups separated by hyphens:

xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
              ↑    ↑
        version    variant
  • The M nibble (first hex digit of the third group) encodes the version (1–8)
  • The N nibble (first hex digit of the fourth group) encodes the variant. All RFC 9562 UUIDs use variant 10xx in binary, meaning N is 8, 9, a, or b.

This structure means you can determine the UUID version and confirm RFC compliance just by inspecting two characters of the string — no parsing library needed.
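
As a minimal sketch of that two-character check (the helper name inspect_uuid is hypothetical, and the input is assumed to be in the canonical 36-character form), the version and variant can be read by string position and cross-checked against Python's standard uuid module:

```python
import uuid

def inspect_uuid(text: str) -> tuple[int, bool]:
    """Return (version, is_rfc_variant) by looking at two hex characters."""
    s = text.strip().lower()
    version_nibble = s[14]   # first hex digit of the third group (M)
    variant_nibble = s[19]   # first hex digit of the fourth group (N)
    return int(version_nibble, 16), variant_nibble in "89ab"

sample = "550e8400-e29b-41d4-a716-446655440000"
version, is_rfc = inspect_uuid(sample)
print(version, is_rfc)  # 4 True

# Cross-check against the standard library parser
parsed = uuid.UUID(sample)
assert parsed.version == version
assert (parsed.variant == uuid.RFC_4122) == is_rfc
```

The positional check does not validate the rest of the string, so a real parser is still the safer choice for untrusted input.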

The 128-bit size means the theoretical maximum UUID space is 2^128 ≈ 3.4 × 10^38 unique values, far more than could ever be exhausted by any practical system. The question is never "will we run out?" but "what are the collision odds for the version I'm using?"

All 8 UUID Versions: What Each Is For

RFC 9562 defines eight versions. Most applications only need to know about three: v4, v7, and v5.

Version | Source of uniqueness | Best used for
v1 | MAC address + 60-bit timestamp (100 ns resolution) | Legacy systems; avoid in new code because the MAC address leaks network identity
v2 | DCE Security: UID/GID + domain + timestamp | DCE/POSIX security contexts only; essentially unused today
v3 | MD5 hash of a namespace UUID + name | Deterministic IDs from names; prefer v5 (MD5 is deprecated)
v4 | 122 bits of CSPRNG randomness | General purpose when sort order does not matter
v5 | SHA-1 hash of a namespace UUID + name | Deterministic, reproducible IDs from known inputs
v6 | Reordered v1 timestamp (sortable) + node/clock | Sortable, MAC-based IDs; migration from v1
v7 | Unix ms timestamp + random bits | Recommended for database primary keys: sortable and random
v8 | Custom (application-defined 122-bit payload) | Vendor-specific or experimental UUID schemes

The practical decision tree for new projects:

  • Database primary key or any ID that will appear in a B-tree index → v7
  • Deterministic ID generated from a known name or URL → v5
  • Random ID where order does not matter and performance is not critical → v4
  • Migrating a v1-based system → v6
  • Everything else → v4 as the safe default

UUID v4: Structure and Collision Probability

UUID v4 uses 122 bits of cryptographically secure random data (the remaining 6 bits are fixed for version and variant). A v4 UUID looks like:

550e8400-e29b-41d4-a716-446655440000
              ↑    ↑
      version=4    variant=a (10xx)

Collision probability: with 2^122 possible values, generating n UUIDs gives a collision probability of approximately n²/2^123 (the approximation holds while that value is small). In concrete terms:

  • 1 billion UUIDs generated: probability of any collision ≈ 9.4 × 10^-20, or roughly 1 in 10^19 (10 quintillion)
  • You would need to generate approximately 2.7 × 10^18 UUIDs (2.7 quintillion) before the probability of a collision reaches 50%

This is the birthday paradox applied to a 122-bit space. For any real-world system, v4 collision probability is effectively zero — as long as your random number generator is a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator). Never use Math.random() or language-default non-secure random functions to generate UUIDs; use crypto.randomUUID() in browsers/Node.js, uuid.uuid4() in Python, or uuid.New() in Go.
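
The birthday-bound arithmetic can be checked directly. This is a small sketch rather than a rigorous derivation: the function names are made up for illustration, and the first formula is the usual small-probability approximation.

```python
import math

RANDOM_BITS = 122          # random bits in a v4 UUID
SPACE = 2 ** RANDOM_BITS   # number of possible v4 values

def collision_probability(n: int) -> float:
    """Birthday-bound approximation p ≈ n^2 / (2 * SPACE), valid while p is small."""
    return n * n / (2 * SPACE)

def count_for_probability(p: float) -> float:
    """UUIDs needed to reach collision probability p: sqrt(2 * SPACE * ln(1 / (1 - p)))."""
    return math.sqrt(2 * SPACE * math.log(1 / (1 - p)))

print(f"{collision_probability(10**9):.1e}")   # about 9.4e-20 for one billion UUIDs
print(f"{count_for_probability(0.5):.1e}")     # about 2.7e+18 UUIDs for a 50% chance
```

Changing RANDOM_BITS to 74 gives a rough feel for how much collision headroom a time-ordered layout keeps within a single millisecond.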

The one real problem with v4: database index fragmentation. Because v4 is random, inserting it into a B-tree primary key index (which MySQL InnoDB and SQL Server use by default) means every insert lands at a random position in the index rather than at the end. This causes page splits, index fragmentation, and poor write performance at scale. At millions of rows this is measurable; at hundreds of millions it becomes a serious problem. This is exactly what v7 solves.
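
To make the insert-position effect concrete, here is a toy model, not a benchmark. It assumes a sorted Python list as a stand-in for a B-tree index and an increasing counter prefix as a stand-in for time-ordered keys; it only counts where inserts land and does not model pages or splits.

```python
import bisect
import uuid

def count_mid_index_inserts(keys: list[bytes]) -> int:
    """Insert keys into a sorted list and count inserts that do not land at the end."""
    index: list[bytes] = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos != len(index):
            mid_inserts += 1
        index.insert(pos, key)
    return mid_inserts

N = 10_000
random_keys = [uuid.uuid4().bytes for _ in range(N)]
# Stand-in for time-ordered keys: big-endian counter in the high 6 bytes, random low bytes.
ordered_keys = [i.to_bytes(6, "big") + uuid.uuid4().bytes[6:] for i in range(N)]

print("random keys, mid-index inserts:", count_mid_index_inserts(random_keys))
print("ordered keys, mid-index inserts:", count_mid_index_inserts(ordered_keys))
```

With random keys almost every insert lands somewhere in the middle, while the ordered keys always append, which is the behaviour a clustered index rewards.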

UUID v7: Why It Replaced v4 for Database Keys

UUID v7 (new in RFC 9562) encodes a Unix timestamp in milliseconds in the most significant 48 bits, followed by random bits. This makes v7 UUIDs:

  • Time-ordered: lexicographically sortable by creation time at millisecond granularity. Order within the same millisecond is only guaranteed when the generator adds a monotonic counter, which RFC 9562 describes as an optional technique
  • Cluster-friendly: new inserts land at or near the end of the index, which avoids most B-tree page splits
  • Time-extractable: you can recover the creation timestamp from the UUID itself
  • Still highly random: up to 74 random bits remain, close to the 80 random bits of a ULID

A v7 UUID generated at 2025-05-24T10:30:00.000Z looks like:

019701d7-a840-7xxx-8xxx-xxxxxxxxxxxx
└───────────┘
48-bit Unix ms timestamp (0x019701d7a840 = 1748082600000 ms)
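
The bit layout can be sketched by hand in Python. This is an illustration under stated assumptions: uuid7_from_ms and timestamp_ms are hypothetical helper names, the random bits come straight from os.urandom with no monotonic counter, and production code should prefer a maintained library.

```python
import datetime
import os
import uuid

def uuid7_from_ms(unix_ms: int) -> uuid.UUID:
    """Build a version 7 UUID: 48-bit timestamp, 4-bit version, 12 random bits,
    2-bit variant, 62 random bits."""
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                  # 12 bits
    rand_b = rand & ((1 << 62) - 1)                # 62 bits
    value = (unix_ms & ((1 << 48) - 1)) << 80      # timestamp in the top 48 bits
    value |= 0x7 << 76                             # version nibble
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)

def timestamp_ms(u: uuid.UUID) -> int:
    """Recover the Unix millisecond timestamp from the top 48 bits."""
    return u.int >> 80

moment = datetime.datetime(2025, 5, 24, 10, 30, tzinfo=datetime.timezone.utc)
ms = int(moment.timestamp() * 1000)
u = uuid7_from_ms(ms)
print(u)                 # starts with 019701d7-a840-7
print(timestamp_ms(u))   # 1748082600000
```

Because the timestamp occupies the most significant bits, comparing two such UUIDs as strings or as bytes orders them by creation millisecond.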

Benchmark comparison (PostgreSQL, 10M rows, UUID primary key):

UUID type | Bulk insert time | Index size
BIGINT AUTOINCREMENT | Baseline | Baseline
UUID v4 (random) | 3–5× slower | ~30% larger (fragmentation)
UUID v7 (time-ordered) | ~1.1× slower | ~5% larger

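PostgreSQL 13+ includes a native gen_random_uuid() that returns v4. PostgreSQL 18 adds a native uuidv7() function; on earlier versions, generate v7 in the application or use a third-party extension such as pg_uuidv7, which provides uuid_generate_v7(). MySQL 8.0+ provides UUID_TO_BIN(UUID(), 1), which swaps the time fields of its v1 UUIDs for better index locality; v7 is the cleaner modern equivalent.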

UUID v5: Deterministic IDs from Names

UUID v5 generates a deterministic UUID from a namespace UUID and a name string using SHA-1. The same namespace + name always produces the same UUID, on any machine, at any time — no coordination required.

RFC 9562 defines four standard namespace UUIDs:

  • 6ba7b810-9dad-11d1-80b4-00c04fd430c8: DNS (for domain names)
  • 6ba7b811-9dad-11d1-80b4-00c04fd430c8: URL (for URLs)
  • 6ba7b812-9dad-11d1-80b4-00c04fd430c8: OID (for ISO OIDs)
  • 6ba7b814-9dad-11d1-80b4-00c04fd430c8: X.500 DN (for directory names)

Use cases for v5:

  • Generating a stable ID for a URL or external resource without a round-trip to a database
  • Deduplication: two systems independently generating a UUID for the same URL get the same ID
  • Idempotent event IDs in event-driven systems: retrying the same operation produces the same event ID
  • Sharding: deterministically mapping a known input to a shard without a lookup table
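
To show why no coordination is needed, the v5 computation can be reproduced with nothing but SHA-1. The sketch below uses a hypothetical helper name, uuid5_manual, and compares it with the standard library's uuid.uuid5; it is meant to illustrate the steps, not to replace the library call.

```python
import hashlib
import uuid

def uuid5_manual(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Hash namespace bytes + name with SHA-1, keep 16 bytes, stamp version and variant."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])
    raw[6] = (raw[6] & 0x0F) | 0x50   # set the version nibble to 5
    raw[8] = (raw[8] & 0x3F) | 0x80   # set the variant bits to 10xx
    return uuid.UUID(bytes=bytes(raw))

url = "https://codelint.dev/utility/uuid-generator"
manual = uuid5_manual(uuid.NAMESPACE_URL, url)
stdlib = uuid.uuid5(uuid.NAMESPACE_URL, url)
print(manual == stdlib)  # True
```

Any system that applies the same namespace, the same name bytes, and the same two bit-stamping steps arrives at the same ID, which is what makes v5 usable for deduplication and idempotent event IDs.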

Code Examples: UUID Generation in Python, Node.js, Go, and SQL

Always use platform-provided cryptographically secure UUID implementations — never roll your own.

Python (stdlib uuid module)
import uuid

# UUID v4 — random, general purpose
v4 = uuid.uuid4()
print(v4)           # e.g. 550e8400-e29b-41d4-a716-446655440000

# UUID v5 — deterministic from namespace + name
NS_URL = uuid.NAMESPACE_URL
page_id = uuid.uuid5(NS_URL, 'https://codelint.dev/utility/uuid-generator')
print(page_id)      # always the same for this URL

# UUID v1 — time + MAC (avoid in new code)
v1 = uuid.uuid1()
print(v1)

# Extract timestamp from a v1 UUID (100-ns intervals since 1582-10-15)
import datetime
ts_100ns = v1.time
epoch_diff = 0x01b21dd213814000   # 100-ns intervals from 1582-10-15 to 1970-01-01
unix_ts_ns = (ts_100ns - epoch_diff) * 100
created_at = datetime.datetime.fromtimestamp(unix_ts_ns / 1e9, tz=datetime.timezone.utc)
print(created_at)

# Store as bytes (16 bytes) instead of string (36 bytes) for database columns
uuid_bytes = v4.bytes    # 16 bytes
uuid_str   = str(v4)     # 36 characters
restored   = uuid.UUID(bytes=uuid_bytes)
print(restored == v4)    # True

JavaScript (Node.js / Browser)
import { randomUUID } from 'node:crypto';  // Node.js 14.17+

// UUID v4 — built into Node.js and all modern browsers
const v4 = randomUUID();
console.log(v4); // crypto-secure CSPRNG, never Math.random()

// Browser equivalent (no import needed)
// const v4 = crypto.randomUUID();

// UUID v5 — deterministic (no stdlib support; use 'uuid' package)
import { v5 as uuidv5, v7 as uuidv7 } from 'uuid';

const NS_URL = '6ba7b811-9dad-11d1-80b4-00c04fd430c8';
const pageId = uuidv5('https://codelint.dev', NS_URL);
console.log(pageId); // same every time

// UUID v7: time-sortable (needs a 'uuid' package release that exports v7; check the changelog for your installed version)
const v7 = uuidv7();
console.log(v7);

// Extract timestamp from a v7 UUID
function extractV7Timestamp(uuid) {
  const hex = uuid.replace(/-/g, '').slice(0, 12);
  return parseInt(hex, 16); // Unix milliseconds
}
console.log(new Date(extractV7Timestamp(v7)));

Go (google/uuid)
package main

import (
	"fmt"
	"github.com/google/uuid"
)

func main() {
	// UUID v4 — random
	v4 := uuid.New()
	fmt.Println(v4)

	// UUID v5 — deterministic
	pageID := uuid.NewSHA1(uuid.NameSpaceURL, []byte("https://codelint.dev"))
	fmt.Println(pageID) // same every time

	// UUID v7 — time-sortable (google/uuid v1.6+)
	v7, err := uuid.NewV7()
	if err != nil {
		panic(err)
	}
	fmt.Println(v7)

	// Parse and validate
	parsed, err := uuid.Parse("550e8400-e29b-41d4-a716-446655440000")
	if err != nil {
		fmt.Println("invalid UUID")
	} else {
		fmt.Println("version:", parsed.Version()) // prints "version: VERSION_4"
	}
}

SQL (PostgreSQL, with a MySQL variant)
-- UUID v4 built-in (PostgreSQL 13+)
SELECT gen_random_uuid();

-- UUID primary key column (native uuid type: stored as 16 bytes, displayed as text)
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    amount NUMERIC(12, 2) NOT NULL
);

-- MySQL variant (8.0.13+ for expression defaults): UUID stored as BINARY(16) for maximum efficiency
-- Saves 20 bytes per row compared to CHAR(36)
CREATE TABLE orders (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)),
    -- UUID_TO_BIN with swap_flag=1 reorders bytes for index locality
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    amount DECIMAL(12, 2) NOT NULL
);

-- Retrieve readable UUID from binary column (MySQL)
SELECT BIN_TO_UUID(id, 1) as id, created_at, amount FROM orders;

Common UUID Mistakes

  1. Using v1 in security-sensitive contexts. UUID v1 embeds the MAC address of the generating machine. An attacker who obtains a v1 UUID can determine the MAC address of your server, which can aid network reconnaissance. If you expose v1 UUIDs publicly (as API resource IDs, in URLs), use v4 or v7 instead.
  2. Generating with Math.random() or non-CSPRNG sources. Math.random() is not a cryptographically secure RNG. A UUID built from it is predictable by an attacker who can observe other UUIDs, because the PRNG state can be reconstructed. Always use crypto.randomUUID() (browser/Node.js), uuid.uuid4() (Python), or uuid.New() (Go).
  3. Storing UUIDs as CHAR(36) when performance matters. A UUID stored as a 36-character string uses 36 bytes. Stored as BINARY(16), it uses 16 bytes — 55% smaller. For tables with hundreds of millions of rows and UUID primary keys, this difference affects index size, cache efficiency, and I/O. Use BINARY(16) in MySQL; PostgreSQL has a native uuid type that handles this automatically.
  4. Using v4 for database primary keys at scale. Random v4 UUIDs cause B-tree index fragmentation because each insert lands at a random position rather than the end. Use v7 (time-ordered) for any UUID that will be a database primary key. The sequential nature of v7 keeps index pages full and insert performance close to integer autoincrement.
  5. Case sensitivity bugs. UUIDs are case-insensitive by spec, but application code often compares them case-sensitively. Normalise all UUIDs to lowercase at ingestion and before any comparison. Store and display in a single consistent case — lowercase is the RFC convention.
  6. Confusing nil UUID with missing data. The nil UUID 00000000-0000-0000-0000-000000000000 is a valid UUID defined in RFC 9562. Do not use it as a sentinel for "no value" in your database or API — use SQL NULL or an explicit optional type in your application code. The nil UUID appearing in production data is a sign of an uninitialized ID being stored rather than rejected.
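
Mistakes 5 and 6 can both be handled at the point where IDs enter the system. The following is one possible sketch, with parse_id as a hypothetical helper name: it normalises case by parsing into a uuid.UUID object, keeps None as the "no value" marker, and rejects the nil UUID.

```python
import uuid
from typing import Optional

NIL = uuid.UUID(int=0)

def parse_id(raw: Optional[str]) -> Optional[uuid.UUID]:
    """Parse an incoming ID. None stays None; the nil UUID is rejected."""
    if raw is None:
        return None
    parsed = uuid.UUID(raw.strip())   # raises ValueError on malformed input
    if parsed == NIL:
        raise ValueError("nil UUID is not accepted as an identifier")
    return parsed

a = parse_id("550E8400-E29B-41D4-A716-446655440000")
b = parse_id("550e8400-e29b-41d4-a716-446655440000")
print(a == b)    # True
print(str(a))    # 550e8400-e29b-41d4-a716-446655440000
```

Comparing parsed objects instead of raw strings sidesteps case issues entirely, and str() always emits the lowercase form.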

Frequently Asked Questions

What is the difference between UUID v4 and UUID v7?
UUID v4 is 122 bits of pure random data — highly unique but not sortable and bad for database B-tree indexes because random inserts cause page splits. UUID v7 encodes a 48-bit Unix millisecond timestamp in the high bits followed by 74 bits of randomness, making it both time-sortable and highly unique. For database primary keys, v7 is strongly preferred because sequential inserts maintain index locality. For everything else (API tokens, session IDs, non-indexed identifiers), v4 is a fine default.
Can two UUID v4 values ever be the same?
Theoretically yes, but the probability is astronomically small. UUID v4 has 122 random bits (2^122 ≈ 5.3 × 10^36 possible values). To reach a 50% chance of any collision, you would need to generate approximately 2.7 × 10^18 UUIDs. At 1 billion UUIDs per second, that would take about 85 years. For any real-world system, UUID v4 collisions from a proper CSPRNG are effectively impossible.
What is UUID v5 used for?
UUID v5 generates a deterministic UUID from a namespace UUID and a name string using SHA-1. The same namespace + name always produces the same UUID, on any machine. This is useful for: deduplication (two systems generate the same ID for the same URL), idempotent operations (retrying the same event produces the same event ID), and content-addressable systems where the ID should be derived from the content itself. Use v5 over v3 (which uses MD5, a deprecated algorithm).
Should I use UUID or BIGINT AUTOINCREMENT for database primary keys?
BIGINT AUTOINCREMENT is simpler and has better index performance, but requires a centralised sequence generator — which is a bottleneck in distributed systems and complicates database sharding. UUID v7 gives you globally unique IDs without coordination, good index performance (near-sequential), and the creation timestamp embedded in the ID. The tradeoff: UUIDs use 16 bytes (vs 8 for BIGINT) and are less human-readable. For microservices and distributed databases, UUID v7 is the modern standard.
Is it safe to expose UUIDs in URLs and API responses?
UUID v4 and v7 are safe to expose in the sense that they do not reveal a MAC address the way v1 does. Note that v7 does embed its creation time, so avoid it where the timestamp itself is sensitive. Exposing an ID in a URL does not make the resource secure either. Always enforce authorisation checks: knowing a UUID should not by itself grant access to the resource it identifies. UUIDs in URLs are not a substitute for proper access control.
What is the nil UUID and when should I use it?
The nil UUID is 00000000-0000-0000-0000-000000000000 (all zeros). It is defined in RFC 9562 as a special-case UUID representing "no value" in contexts where the UUID type is mandatory. In databases, it is almost always better to use NULL instead, because database NULL correctly propagates through JOINs and IS NULL queries while the nil UUID would behave as a regular row value. Reserve the nil UUID for contexts where NULL is not available (e.g. Protocol Buffers message fields, binary formats that require a fixed-length UUID).
