Anatomy: What Every Region of a QR Code Does
A QR code is a grid of black and white modules (the "pixels"), and a surprising share of them carry no data at all — they exist to make scanning fast and orientation-proof:
- Finder patterns — the three large squares in the corners. Their genius is the ratio: any line through the center crosses black-white-black-white-black in the proportion 1:1:3:1:1, a signature that holds at any rotation, any scale, and even mirrored. This is why scanners lock on in milliseconds from any angle — the detection problem was solved geometrically, not computationally.
- Timing patterns — the alternating black/white lines connecting the finder squares. They tell the decoder the exact module grid spacing, compensating for perspective distortion and print stretching.
- Alignment patterns — smaller squares (absent in version 1, increasingly numerous in larger versions) that correct local warping — the curved-cup problem.
- Format information — 15 bits around the finder patterns declaring the error-correction level and mask pattern, themselves protected by their own error-correcting code (the decoder must read this before it can read anything else).
- Quiet zone — the mandatory white border (4 modules wide) that separates the code from surrounding clutter. Most "my QR code won't scan" complaints trace to designers shrinking or coloring this margin.
- Data + error correction codewords — everything else, filled in a zigzag from the bottom-right corner upward, in 8-bit codewords.
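The 1:1:3:1:1 signature described in the finder-pattern bullet can be tested with nothing more than run lengths along one scan line. The sketch below is illustrative only: the function names, the 0/1 pixel representation (1 = dark), and the 50% tolerance are assumptions made for this example, not taken from any particular scanner.

```python
from itertools import groupby

def run_lengths(row):
    """Collapse a row of 0/1 pixels (1 = dark) into (value, length) runs."""
    return [(value, len(list(group))) for value, group in groupby(row)]

def looks_like_finder(counts, tolerance=0.5):
    """True if five consecutive run lengths approximate 1:1:3:1:1."""
    total = sum(counts)
    if len(counts) != 5 or total < 7:
        return False
    module = total / 7.0  # the pattern spans 7 modules in total
    expected = (1, 1, 3, 1, 1)
    return all(
        abs(count - e * module) <= e * module * tolerance
        for count, e in zip(counts, expected)
    )

def find_finder_candidates(row):
    """Return the start offsets of dark-light-dark-light-dark windows that match."""
    runs = run_lengths(row)
    hits = []
    position = 0
    for i, (value, length) in enumerate(runs):
        window = runs[i:i + 5]
        if value == 1 and len(window) == 5:
            if looks_like_finder([n for _, n in window]):
                hits.append(position)
        position += length
    return hits
```

Because the test only compares each run against the total width of the window, the same row matches whether a module is 2 pixels wide or 20, which is the scale independence the ratio is designed for.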
Versions set the grid size: version 1 is 21×21 modules, each version adds 4 modules per side, up to version 40 at 177×177, which holds up to 2,953 bytes or 7,089 digits at the lowest error-correction level. The dense codes on shipping labels and payment posters are simply higher versions; the sparse ones on business cards encoding a short URL are typically versions 2–4.
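The size rule reduces to one line of arithmetic: side length = 17 + 4 × version. A minimal sketch (the function name is made up for illustration):

```python
def modules_per_side(version):
    """Side length in modules for QR versions 1 through 40."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 17 + 4 * version
```

For example, `modules_per_side(1)` gives 21 and `modules_per_side(40)` gives 177, matching the two ends of the range.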
The Four Encoding Modes: Why Content Type Changes Code Size
QR codes do not store "text" — they store bits, and the standard defines four modes with radically different efficiency:
| Mode | Character set | Bits per char | Max capacity (v40, level L) |
|---|---|---|---|
| Numeric | 0–9 | ~3.33 | 7,089 digits |
| Alphanumeric | 0–9, A–Z (upper only), space, $ % * + - . / : | ~5.5 | 4,296 chars |
| Byte | Any 8-bit data (UTF-8 in practice) | 8 | 2,953 bytes |
| Kanji | Shift-JIS double-byte characters | 13 | 1,817 chars |
The practical consequences generators quietly handle for you:
- Numeric mode packs 3 digits into 10 bits — why a long invoice number makes a much sparser code than the same-length word.
- Case matters: "HTTPS://EXAMPLE.COM/ABC" fits alphanumeric mode; "https://example.com/abc" forces byte mode — noticeably denser. (Domains are case-insensitive, so uppercase-domain tricks genuinely shrink codes; URL paths, however, are case-sensitive.)
- Good encoders switch modes mid-stream — encoding a URL's letters in byte mode and a long trailing ID in numeric mode, each segment with its own mode header.
- Shorter is better everywhere: a shorter payload can drop the code to a lower version, which means fewer and therefore larger modules at the same printed size, which in turn scan from farther away and faster. This is the engineering case for putting a short redirect URL in the code rather than a 300-character tracking URL.
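The mode arithmetic above can be made concrete with a rough bit counter. This is a sketch under stated assumptions: it picks a single mode for the whole string (no mid-stream switching), skips Kanji mode, and counts only the character payload, leaving out the mode indicator and character-count field that a real encoder also writes. The names `pick_mode` and `payload_bits` are hypothetical.

```python
DIGITS = set("0123456789")
ALPHANUMERIC = DIGITS | set("ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

def pick_mode(text):
    """Choose the densest single mode whose character set covers the text."""
    chars = set(text)
    if chars <= DIGITS:
        return "numeric"
    if chars <= ALPHANUMERIC:
        return "alphanumeric"
    return "byte"

def payload_bits(text):
    """Bits needed for the characters alone (headers excluded)."""
    mode = pick_mode(text)
    n = len(text)
    if mode == "numeric":
        # 3 digits -> 10 bits; a trailing 1 or 2 digits take 4 or 7 bits
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        # 2 characters -> 11 bits; a trailing single character takes 6 bits
        return 11 * (n // 2) + 6 * (n % 2)
    return 8 * len(text.encode("utf-8"))
```

Under these assumptions, `payload_bits("HTTPS://EXAMPLE.COM/ABC")` comes to 127 bits, while the lowercase form of the same 23 characters costs 184 bits in byte mode.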
Reed-Solomon: The Math That Survives a Hole Through the Code
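The signature feature, scanning through damage, comes from Reed-Solomon error correction, the same code family protecting CDs, DVDs, and deep-space transmissions. Conceptually: your data bytes are treated as coefficients of a polynomial, from which extra parity codewords are computed (in QR's case, as the remainder of a polynomial division over the finite field GF(256)). Because a polynomial of degree k is fully determined by any k+1 points, the codeword carries more information than strictly necessary, and the decoder can reconstruct the original data from a subset of the codewords; it does not need them all. The catch is that the decoder must also work out which codewords are wrong: fixing an error at an unknown position costs two parity codewords, while filling in a codeword known to be missing costs only one.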
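The "any k+1 points" idea can be demonstrated with a toy erasure code. To keep the arithmetic readable, this sketch works modulo the prime 257 and encodes by evaluating the message polynomial at x = 0, 1, 2, ...; that is a deliberate simplification and not how QR codes compute their parity, and it only handles erasures (codewords known to be missing), not errors at unknown positions. The function names are hypothetical, and `pow(x, -1, P)` needs Python 3.8 or later.

```python
P = 257  # small prime field, chosen for readability (an assumption of this toy)

def encode(message, n):
    """Evaluate the message polynomial at x = 0..n-1, giving n codewords."""
    return [
        sum(c * pow(x, i, P) for i, c in enumerate(message)) % P
        for x in range(n)
    ]

def recover(points, k):
    """Rebuild k coefficients from any k surviving (x, y) points (Lagrange)."""
    points = points[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(points):
        basis = [1]  # running product of (x - xm) for all m != j
        denom = 1    # running product of (xj - xm) for all m != j
        for m, (xm, _) in enumerate(points):
            if m == j:
                continue
            grown = [0] * (len(basis) + 1)
            for i, b in enumerate(basis):
                grown[i] = (grown[i] - xm * b) % P
                grown[i + 1] = (grown[i + 1] + b) % P
            basis = grown
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P  # modular inverse of the denominator
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % P
    return coeffs

message = list(b"QR code")                 # 7 data values
codeword = encode(message, 11)             # 7 data + 4 redundant evaluations
survivors = [(x, y) for x, y in enumerate(codeword) if x not in (1, 4, 5, 9)]
restored = recover(survivors, len(message))
```

Any four of the eleven values can be knocked out and `restored` still equals `message`; knock out a fifth and there are no longer enough points to pin down the polynomial.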
QR offers four correction levels, a direct trade between capacity and survivability:
| Level | Recoverable damage | Typical use |
|---|---|---|
| L (Low) | ~7% | Clean digital display, maximum capacity |
| M (Medium) | ~15% | The default for general printing |
| Q (Quartile) | ~25% | Industrial environments, dirt and wear |
| H (High) | ~30% | Logo-in-the-middle designs, outdoor weathering |
That logo in the middle of branded QR codes? It is not "part of" the code. It is deliberate destruction that level H error correction repairs on every single scan. The dots the logo covers are simply lost, and Reed-Solomon reconstructs them. This also explains the design rule: logo area must stay well under the correction budget (~30% for level H; in practice keep it under 20%, to leave margin for print imperfections, which consume the same budget).
One more hidden mechanism: masking. Before output, the encoder XORs the data region with each of 8 predefined checkerboard-like patterns in turn and scores each result, picking the mask that best breaks up large same-color blobs and avoids fake finder-pattern shapes (1:1:3:1:1 runs appearing in data would confuse scanners). The winning mask's ID is stored in the format information. It is one reason the same URL can generate different-looking codes across generators (differences in version, error-correction level, and mode segmentation are others), and all of them are correct.
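A small sketch of the masking step, using the eight mask conditions as commonly given for the QR standard (r is the row, c the column; a module is flipped where the condition is true). Two simplifications are assumptions of this example: it masks the entire grid, whereas a real encoder leaves finder, timing, and format modules untouched, and its score is a crude stand-in (longest same-color run) for the standard's four penalty rules.

```python
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]

def apply_mask(grid, mask_id):
    """XOR every module with the chosen mask pattern (1 = dark)."""
    mask = MASKS[mask_id]
    return [
        [bit ^ int(mask(r, c)) for c, bit in enumerate(row)]
        for r, row in enumerate(grid)
    ]

def longest_run(grid):
    """Toy penalty: the longest same-color run in any row or column."""
    lines = grid + [list(col) for col in zip(*grid)]
    best = 0
    for line in lines:
        run, prev = 0, None
        for bit in line:
            run = run + 1 if bit == prev else 1
            prev = bit
            best = max(best, run)
    return best

def best_mask(grid):
    """Pick the mask ID whose result has the lowest toy penalty."""
    return min(range(len(MASKS)), key=lambda m: longest_run(apply_mask(grid, m)))
```

Since masking is a plain XOR, applying the same mask a second time restores the original grid, which is how the decoder undoes it after reading the mask ID from the format information.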
Quishing: The Security Problem Printed on Every Table
QR codes have one structural security flaw: humans cannot read them. A poisoned code looks identical to a legitimate one, and that asymmetry built a whole attack category — "quishing" (QR phishing) — that grew rapidly through the mid-2020s:
- The sticker attack. Print malicious codes, stick them over legitimate ones — on parking meters, restaurant tables, EV chargers, charity posters. Victims "pay the parking meter" into an attacker's clone site. Documented repeatedly across US, UK, and European cities.
- Email quishing. A QR code image in a phishing email ("scan to review the voicemail / MFA update") moves the victim from their corporate laptop, which sits behind URL filtering, to their personal phone, which usually does not. This detour around enterprise security is precisely why attackers prefer the QR code over a link.
- Fake payment codes. In payment-QR economies (UPI in India, Pix in Brazil, Alipay/WeChat Pay in China), swapped merchant codes redirect customer payments. A useful distinction most users never learn: scanning a static merchant code and approving a payment sends money; anyone asking you to scan a code "to receive money" is running a script — receiving never requires approving.
Defenses, in order of realism:
- Read the preview URL — every modern phone shows the destination before opening. Check the actual domain (not the subdomain: paypal.com.evil.example is evil.example), be suspicious of URL shorteners on physical posters, and treat "login pages" reached via QR like any unsolicited login link: navigate manually instead.
- Physical inspection for stickers-over-stickers on payment infrastructure — crooked edges and raised surfaces are the tell.
- For issuers: use your own domain (never a generic shortener), print the human-readable URL next to the code, and prefer dynamic codes you can audit and redirect if compromised.
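The "check the actual domain" advice can be expressed as a host comparison. The sketch below assumes you already know which domain you expect (the `trusted_domain` argument and the function name are hypothetical) and simply checks whether the scanned URL's host is that domain or a subdomain of it; it does not attempt to work out registrable domains in general, which would need the Public Suffix List.

```python
from urllib.parse import urlsplit

def host_matches(url, trusted_domain):
    """True if the URL's host is trusted_domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    trusted = trusted_domain.lower()
    return host == trusted or host.endswith("." + trusted)
```

With this check, `https://paypal.com.evil.example/login` fails against `paypal.com` because the host ends in `evil.example`, while `https://www.paypal.com/signin` passes. A URL such as `https://paypal.com@evil.example/` also fails, since the part before the `@` is user info rather than the host.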
Generating QR Codes Well: The Practical Checklist
Everything above condenses into a short list that separates codes that scan first-time from codes that frustrate:
- ✅ Keep the payload short. Short URL → lower version → bigger modules → scans farther and faster, or prints smaller for the same scanning distance. If you need tracking parameters, put them behind a redirect on your own domain.
- ✅ Match error correction to context: M for clean print, H if a logo or outdoor weathering is involved — and keep logo coverage under ~20%.
- ✅ Respect the quiet zone — 4 modules of clear margin, same color as the background. The most-violated rule in graphic design.
- ✅ Contrast: dark-on-light, always. Scanners assume dark modules on a light background; inverted codes fail on many readers. Keep contrast high — pastel-on-white photographs poorly under real lighting.
- ✅ Size for distance: rule of thumb, scanning distance ÷ 10 = minimum code width. A poster scanned from 3 meters needs a ~30 cm code; a business card scanned at 25 cm needs 2.5 cm.
- ✅ Test on real phones — old Android and iPhone, low light, at angle, after printing (not just on screen). Print processes eat fine detail that looked crisp in the PDF.
- ✅ Mind privacy when generating: QR payloads are often sensitive (Wi-Fi credentials, payment strings, vCards). A generator that renders entirely client-side — in your browser, nothing uploaded — means the payload never touches someone else's server.
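The sizing and quiet-zone rules from the checklist combine into a quick layout calculation. This sketch assumes the "code width" in the distance rule means the symbol itself, without its margin; the helper names are hypothetical, and the ÷ 10 ratio is the rule of thumb quoted above rather than a guarantee.

```python
def min_code_width_cm(scan_distance_cm):
    """Rule of thumb: code width is at least one tenth of the scan distance."""
    return scan_distance_cm / 10

def printed_footprint_cm(code_width_cm, modules_per_side, quiet_zone=4):
    """Total width to reserve, including the quiet zone on both sides."""
    module_cm = code_width_cm / modules_per_side
    return code_width_cm + 2 * quiet_zone * module_cm
```

For a poster scanned from 3 meters, `min_code_width_cm(300)` gives 30 cm; if that code is a 21-module version 1 symbol, `printed_footprint_cm(30, 21)` shows the layout needs roughly 41.4 cm once the 4-module margin is included.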
The QR code is a rare artifact: a 1990s industrial spec that became daily infrastructure for billions with its core design essentially unchanged. The squares earned it.