Technical Guide March 14, 2026

How QR Codes Work — A Technical Guide

QR codes are everywhere — restaurant tables, boarding passes, payment terminals, factory floors. But what's actually happening inside that square grid of black and white dots? This guide breaks down the encoding process, the physical structure, and the math that makes QR codes surprisingly hard to break.

The Anatomy of a QR Code

Every QR code is built from a fixed set of structural components. If you look closely at any code, you'll spot them immediately.

Finder patterns

The three large squares in the top-left, top-right, and bottom-left corners are finder patterns. Each one is a 7×7 module block: a 3×3 dark center surrounded by a light ring, surrounded by a dark ring. This specific ratio — 1:1:3:1:1 — is designed to be recognizable no matter how the scanner encounters it. Rotate the code 90°, flip it, scan it at an angle — the finder patterns still register.
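A minimal sketch of that layout, assuming a grid representation where 1 means a dark module and 0 a light one. The function names are illustrative and not taken from any library. It builds the 7×7 block and checks that a line through the centre yields the 1:1:3:1:1 run lengths.

```python
def finder_pattern():
    """Return a 7x7 grid: 1 = dark module, 0 = light module."""
    grid = []
    for r in range(7):
        row = []
        for c in range(7):
            # Ring index = Chebyshev distance from the centre module (3, 3).
            ring = max(abs(r - 3), abs(c - 3))
            # Rings 0-1 form the dark 3x3 centre, ring 2 is light, ring 3 is dark.
            row.append(0 if ring == 2 else 1)
        grid.append(row)
    return grid


def run_lengths(line):
    """Collapse a line of modules into the lengths of its same-colour runs."""
    runs, count = [], 1
    for prev, cur in zip(line, line[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


pattern = finder_pattern()
centre_row = pattern[3]
centre_col = [row[3] for row in pattern]
print(run_lengths(centre_row), run_lengths(centre_col))
```

Because the pattern is symmetric, the same ratio appears along the centre row and the centre column, which is why a scan line crossing the centre sees it regardless of 90° rotations.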

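Why three instead of four? Three reference points are enough to fix the code's position, scale, and rotation, and the missing fourth square tells the scanner which corner is the bottom-right, so the orientation is never ambiguous. The scanner estimates the fourth corner from the other three, and from version 2 onward an alignment pattern near that corner helps refine the estimate.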

Timing patterns

Two single-module-wide lines run between the finder patterns — one horizontal (row 6) and one vertical (column 6). They alternate strictly: dark, light, dark, light. The scanner uses these to determine the exact module grid spacing, which matters when the code is printed at odd sizes or on curved surfaces.

Alignment patterns

Codes from version 2 onward include alignment patterns — small 5×5 targets scattered across the data area. Version 2 has one. Version 7 has six. Version 40 has 46. They correct for perspective distortion, like when someone scans a code printed on a coffee cup or a wrinkled poster. The positions are defined in a lookup table in the QR code spec (ISO/IEC 18004), and they're placed so they don't overlap with finder patterns.
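The counts quoted above follow a simple pattern, sketched here as an assumption-labelled shortcut rather than the spec's own method: each version has a small list of centre coordinates, patterns sit at every row/column combination of them, and the three combinations that would land on finder patterns are dropped. The actual coordinates still come from the lookup table; this only reproduces the count.

```python
def alignment_pattern_count(version):
    """Number of alignment patterns for a QR version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    if version == 1:
        return 0  # version 1 has no alignment patterns
    # Number of centre coordinates along one axis grows every 7 versions.
    centres = version // 7 + 2
    # Every (row, col) combination except the three finder-pattern corners.
    return centres * centres - 3


for v in (1, 2, 7, 40):
    print(v, alignment_pattern_count(v))
```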

Format and version information

A 15-bit format string near the finder patterns stores the error correction level and mask pattern. It is written twice for redundancy: once around the top-left finder pattern, and once split between the top-right and bottom-left ones. Codes version 7 and above also embed an 18-bit version block in two locations, so the scanner can confirm the grid dimensions before it reads the data area.
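To make the 15 bits concrete, here is a rough sketch of how they can be computed. It assumes the standard layout: 2 bits for the error correction level, 3 bits for the mask number, and 10 BCH check bits, with a fixed XOR mask applied at the end. The function and constant names are made up for this example.

```python
# Two-bit level indicators (note that they are not in alphabetical order).
EC_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
GENERATOR = 0b10100110111      # BCH(15,5) generator polynomial
XOR_MASK = 0b101010000010010   # applied so the result is never all zeros


def format_bits(ec_level, mask):
    """Return the 15-bit format value for an EC level and mask number (0-7)."""
    if not 0 <= mask <= 7:
        raise ValueError("mask must be between 0 and 7")
    data = (EC_BITS[ec_level] << 3) | mask
    # Polynomial division over GF(2): the remainder becomes the 10 check bits.
    rem = data << 10
    for shift in range(4, -1, -1):
        if rem & (1 << (shift + 10)):
            rem ^= GENERATOR << shift
    return ((data << 10) | rem) ^ XOR_MASK


print(format(format_bits("L", 0), "015b"))
```

The check bits let a scanner recover the level and mask even when a few of the 15 modules are misread, which matters because nothing else can be decoded without them.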

Data and error correction modules

Everything else — all the remaining modules after the structural elements are placed — stores the actual encoded data interleaved with error correction codewords. The data fills the grid in a specific zigzag pattern: two-module-wide columns, read bottom to top, then top to bottom, snaking right to left across the code.
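The zigzag order can be sketched as a generator of (row, column) positions. This is a simplified illustration: it skips the vertical timing column but does not skip modules reserved for finder, alignment, or format areas, which a real encoder has to do. The function name is hypothetical.

```python
def zigzag_positions(size):
    """Yield (row, col) in placement order for a size x size grid.

    Simplification: reserved modules are NOT filtered out here.
    """
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:
            col -= 1  # the vertical timing pattern occupies column 6
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            yield (row, col)      # right module of the two-wide column
            yield (row, col - 1)  # left module
        upward = not upward
        col -= 2


order = list(zigzag_positions(21))
print(order[:4])
```

A real placement routine would walk the same order and simply skip any position already claimed by a structural element.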

Four Encoding Modes

QR codes don't treat all data the same. The spec defines four encoding modes, each optimized for a different character set. The generator picks the mode that produces the smallest output — or splits the data across modes when that's more efficient.

Mode	Characters	Bits per char	Max capacity (V40-L)
Numeric	0–9	3.33	7,089 digits
Alphanumeric	0–9, A–Z, space, $%*+-./:	5.5	4,296 chars
Byte	Any (ISO 8859-1 / UTF-8)	8	2,953 bytes
Kanji	Shift JIS double-byte	13	1,817 chars
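The bits-per-character column can be turned into a quick cost estimate. The sketch below picks the cheapest single mode for a string and counts data bits only, leaving out the mode indicator and character count field. Kanji mode and mixed-mode splitting are omitted, and the function name is an assumption for this example.

```python
DIGITS = set("0123456789")
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def data_bits(text):
    """Return (mode, data_bits) for the cheapest single mode, headers excluded."""
    n = len(text)
    if text and all(ch in DIGITS for ch in text):
        # 10 bits per triplet, 7 for a trailing pair, 4 for a trailing digit.
        return "numeric", 10 * (n // 3) + (0, 4, 7)[n % 3]
    if text and all(ch in ALPHANUMERIC for ch in text):
        # 11 bits per pair, 6 for a trailing single character.
        return "alphanumeric", 11 * (n // 2) + 6 * (n % 2)
    return "byte", 8 * len(text.encode("utf-8"))


print(data_bits("HTTPS://EXAMPLE.COM"))
print(data_bits("https://example.com"))
```

For this 19-character URL the uppercase form costs 105 data bits and the lowercase form 152, which is where the roughly 45% difference comes from.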

Numeric mode

Digits only. The encoder groups them into triplets, converts each triplet to a 10-bit binary value, and packs them tightly. A trailing pair gets 7 bits; a single trailing digit gets 4 bits. This is why phone numbers and serial numbers produce such compact codes — three characters fit into just 10 bits instead of the 24 bits byte mode would need.
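The grouping rule above is short enough to write out. This sketch returns the data bits as a string of 0s and 1s for readability and leaves out the mode indicator and character count; the function name is illustrative.

```python
def encode_numeric(digits):
    """Encode a digit string as numeric-mode data bits (no header fields)."""
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("numeric mode accepts digits 0-9 only")
    widths = {3: 10, 2: 7, 1: 4}  # bits used per group size
    out = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        out.append(format(int(group), f"0{widths[len(group)]}b"))
    return "".join(out)


print(encode_numeric("01234567"))
```

Here "01234567" splits into 012, 345, and 67, giving 10 + 10 + 7 = 27 bits.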

Alphanumeric mode

Covers uppercase letters, digits, and nine special characters: $ % * + - . / : space. Note the absence of lowercase letters. Characters are paired, each pair mapped to an 11-bit value using a 45-character lookup table. A URL like HTTPS://EXAMPLE.COM (all uppercase) encodes in alphanumeric mode. But https://example.com falls through to byte mode because of the lowercase letters — and uses 45% more space.
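As a sketch of the pairing rule: each character's value is its index in the 45-character table, a pair becomes first × 45 + second in 11 bits, and a leftover single character takes 6 bits. Header fields are left out and the function name is an assumption.

```python
# Index in this string is the character's alphanumeric-mode value (0-44).
TABLE = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text):
    """Encode text as alphanumeric-mode data bits (no header fields)."""
    try:
        values = [TABLE.index(ch) for ch in text]
    except ValueError:
        raise ValueError("character not in the alphanumeric set") from None
    out = []
    for i in range(0, len(values) - 1, 2):
        out.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2:
        out.append(format(values[-1], "06b"))  # trailing single character
    return "".join(out)


print(encode_alphanumeric("AC-42"))
```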

Practical tip: If you control the URL, consider using an all-uppercase domain with a case-insensitive redirect. HTTPS://EXAMPLE.COM/GO encodes in alphanumeric mode and produces a visibly smaller QR code than its lowercase equivalent.

Byte mode

The fallback. Any byte value from 0x00 to 0xFF goes through at 8 bits per character. Most URLs, JSON strings, and arbitrary text end up here. UTF-8 multibyte characters work fine — a 3-byte UTF-8 character simply costs 24 bits.

Kanji mode

Specific to Shift JIS encoded Japanese characters. Each double-byte character is subtracted from a base value, compressed, and stored in 13 bits — a savings of 3 bits over byte mode per character. Outside Japan, you'll rarely encounter this mode.
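The compression step can be sketched as follows, taking the two-byte Shift JIS value as an integer. The two ranges and offsets are the ones commonly documented for Kanji mode; the function name is hypothetical.

```python
def kanji_13bit(sjis):
    """Compress a two-byte Shift JIS value into a 13-bit string."""
    if 0x8140 <= sjis <= 0x9FFC:
        offset = 0x8140
    elif 0xE040 <= sjis <= 0xEBBF:
        offset = 0xC140
    else:
        raise ValueError("value is outside the Kanji-mode ranges")
    diff = sjis - offset
    # Multiply the high byte by 0xC0 and add the low byte.
    value = (diff >> 8) * 0xC0 + (diff & 0xFF)
    return format(value, "013b")


print(kanji_13bit(0x935F))
```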

A single QR code can mix modes. The data stream includes mode indicators (4-bit headers) and character count fields that tell the scanner when one mode ends and another begins. A smart encoder might use alphanumeric mode for the protocol and domain of a URL, then switch to byte mode for the path.

Reed-Solomon Error Correction

This is the engineering that makes QR codes work in the real world. Printed codes get scratched, rained on, partially covered by stickers, and faded by sunlight. Reed-Solomon error correction lets the scanner reconstruct missing data from the surviving modules.

The math works over GF(256), a Galois field with 256 elements, where each element corresponds to one byte. The encoder treats the data as coefficients of a polynomial, divides it by a generator polynomial, and appends the remainder as error correction codewords. On the reading side, the scanner computes syndromes from the received codewords, finds the error locator polynomial (commonly with the Berlekamp-Massey algorithm), locates the error positions (typically with a Chien search), and computes the correction values with Forney's algorithm.
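The encoder half of that description fits in a few dozen lines. The sketch below assumes the field polynomial 0x11D and generator roots α^0, α^1, ... that QR codes use; the helper names are made up for this example, and decoding is not shown. The sample data is 16 codewords, the size of a version 1 level M block with 10 error correction codewords.

```python
# Exponent/log tables for GF(256) with primitive polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for i in range(255):
    EXP[i] = value
    LOG[value] = i
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]  # duplicate so gf_mul needs no modulo


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec):
    """Product of (x + alpha^i) for i < n_ec, highest degree first."""
    g = [1]
    for i in range(n_ec):
        nxt = g + [0]
        for j, coef in enumerate(g):
            nxt[j + 1] ^= gf_mul(coef, EXP[i])
        g = nxt
    return g


def rs_ec_codewords(data, n_ec):
    """Remainder of data * x^n_ec divided by the generator polynomial."""
    gen = generator_poly(n_ec)
    rem = list(data) + [0] * n_ec
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j, g in enumerate(gen):
                rem[i + j] ^= gf_mul(g, factor)
    return rem[len(data):]


def poly_eval(poly, point):
    """Evaluate a polynomial (highest degree first) at a field element."""
    result = 0
    for coef in poly:
        result = gf_mul(result, point) ^ coef
    return result


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
ec = rs_ec_codewords(data, 10)
# An undamaged codeword evaluates to zero at every generator root.
syndromes = [poly_eval(data + ec, EXP[i]) for i in range(10)]
print(ec, syndromes)
```

The zero syndromes are what a scanner checks first: any non-zero value means at least one codeword was damaged, and the pattern of values is the input to the error-locating step.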

Here's what that means in practice:

EC Level	Recovery	EC share of codewords	Best for
L (Low)	~7%	~20%	Digital screens, clean environments
M (Medium)	~15%	~37%	General printing (default for most generators)
Q (Quartile)	~25%	~55%	Outdoor signage, industrial labels
H (High)	~30%	~65%	Logo overlays, harsh environments

The recovery percentages refer to codewords, not individual modules. A single corrupted module might affect only one codeword, or it might damage two if it sits on a codeword boundary. In practice, level H can survive a logo covering roughly 20% of the code's center area — though you should always test your codes before printing.

The tradeoff is real: at level H roughly 65% of the code's codewords are error correction, against about 20% at level L, so the same data needs more than twice as many codewords. That means a bigger, denser code that is harder to scan at small sizes or from a distance. Don't default to level H "just in case." Use M for most applications and bump up only when you know the code will take damage.
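To put rough numbers on the tradeoff, the sketch below uses version 40 as a reference point. The codeword counts are assumptions taken from the commonly published capacity tables (3,706 codewords in total, of which 2,956 / 2,334 / 1,666 / 1,276 carry data at L / M / Q / H); smaller versions give slightly different ratios.

```python
TOTAL_CODEWORDS_V40 = 3706
DATA_CODEWORDS_V40 = {"L": 2956, "M": 2334, "Q": 1666, "H": 1276}


def ec_share(level):
    """Fraction of version 40 codewords spent on error correction."""
    return 1 - DATA_CODEWORDS_V40[level] / TOTAL_CODEWORDS_V40


def growth_vs_level_l(level):
    """How many times more codewords a payload needs than at level L."""
    return DATA_CODEWORDS_V40["L"] / DATA_CODEWORDS_V40[level]


for level in "LMQH":
    print(level, f"{ec_share(level):.0%}", f"{growth_vs_level_l(level):.2f}x")
```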

Versions 1 Through 40

The QR code spec defines 40 versions. Version 1 is a 21×21 module grid. Each subsequent version adds 4 modules per side, so version 2 is 25×25, version 3 is 29×29, and version 40 maxes out at 177×177 — that's 31,329 modules.

The generator automatically picks the smallest version that fits your data at the chosen error correction level. Here's how data capacity scales across selected versions:

Version	Modules	Numeric (EC-M)	Alphanumeric (EC-M)	Byte (EC-M)
1	21×21	34	20	14
2	25×25	63	38	26
5	37×37	202	122	84
10	57×57	513	311	213
20	97×97	1,600	970	666
30	137×137	3,289	1,994	1,370
40	177×177	5,596	3,391	2,331

Most real-world QR codes are between version 2 and version 7. In byte mode at level M, version 3 (29×29 modules) holds URLs of up to 42 characters, and version 4 (33×33 modules) holds up to 62. You'd need to encode several paragraphs of text or a very long URL to push past version 10.
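Version selection is essentially a table lookup. A minimal sketch, assuming byte mode at level M and covering only versions 1 through 10, with capacities as given in the standard capacity tables. The names are illustrative.

```python
# Byte-mode capacity at level M for versions 1..10.
BYTE_CAPACITY_M = [14, 26, 42, 62, 84, 106, 122, 152, 180, 213]


def smallest_version_m(payload_bytes):
    """Return the smallest version (1-10) that fits, or None if none does."""
    for version, capacity in enumerate(BYTE_CAPACITY_M, start=1):
        if payload_bytes <= capacity:
            return version
    return None


url = "https://example.com/products/12345"
print(len(url.encode("utf-8")), smallest_version_m(len(url.encode("utf-8"))))
```

Counting encoded bytes rather than characters matters once the payload contains anything outside ASCII.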

Higher versions aren't just bigger — they're harder to scan. A version 40 code has over 31,000 modules. If you print that at 2cm wide, each module is roughly 0.1mm. Most phone cameras can't resolve that. Keep your data short and your version number stays low. That's the single most effective thing you can do for scannability.
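The arithmetic behind that estimate is easy to reuse for your own print sizes. In this sketch the function names and the optional quiet zone parameter are assumptions for illustration; the side length formula follows from version 1 being 21 modules wide and each version adding 4.

```python
def side_modules(version):
    """Modules per side: 21 for version 1, plus 4 per version after that."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return 17 + 4 * version


def module_pitch_mm(version, printed_width_mm, quiet_zone_modules=0):
    """Width of one module when the code is printed at the given width."""
    total = side_modules(version) + 2 * quiet_zone_modules
    return printed_width_mm / total


print(side_modules(40) ** 2)
print(round(module_pitch_mm(40, 20), 3))
print(round(module_pitch_mm(3, 20, quiet_zone_modules=4), 3))
```

At 20 mm wide, a version 3 code with its quiet zone still leaves each module over half a millimetre across, while version 40 drops to about a tenth of that.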

Data Masking

Raw encoded data can produce patterns that confuse scanners — large blocks of identical modules, patterns that mimic finder patterns, or uneven distributions of dark and light modules. To prevent this, the spec defines eight mask patterns (numbered 0–7). Each one is an XOR operation applied to every data module based on its row and column position.

The encoder applies all eight masks, scores each result using four penalty rules (adjacent same-color modules, 2×2 blocks, finder-like sequences, and overall dark/light ratio), and selects the mask with the lowest penalty score. The chosen mask number is stored in the format information bits.
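Here is a simplified sketch of that selection loop. The eight conditions are the standard mask formulas, but two shortcuts are taken for brevity and should be read as assumptions: the mask is applied to every module of the grid rather than to data modules only, and only the 2×2 block rule is scored instead of all four penalty rules.

```python
# A module is flipped when the condition for the chosen mask is true.
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid, mask_id):
    """XOR each module with the mask condition for its position."""
    flip = MASKS[mask_id]
    return [[bit ^ int(flip(r, c)) for c, bit in enumerate(row)]
            for r, row in enumerate(grid)]


def block_penalty(grid):
    """2x2 block rule: 3 points for every 2x2 area of a single colour."""
    score = 0
    for r in range(len(grid) - 1):
        for c in range(len(grid[r]) - 1):
            if grid[r][c] == grid[r][c + 1] == grid[r + 1][c] == grid[r + 1][c + 1]:
                score += 3
    return score


def best_mask(grid):
    """Pick the mask with the lowest penalty (ties go to the lower number)."""
    return min(range(8), key=lambda m: block_penalty(apply_mask(grid, m)))


blank = [[0] * 4 for _ in range(4)]
print(block_penalty(blank), best_mask(blank))
```

Because masking is an XOR, applying the same mask a second time restores the original modules, which is exactly what the scanner does after reading the mask number.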

This is entirely automatic — you never interact with masking directly. But it's the reason two QR codes encoding identical data can look different: different generators might break ties differently or apply optimizations that shift the mask selection.

The Encoding Pipeline

From raw text to scannable image, the full encoding process runs through these steps:

1. Mode analysis: determine the most efficient encoding mode (or mix of modes) for the input data
2. Data encoding: convert characters to bit streams using the selected mode's rules
3. Version selection: find the smallest version that fits the bit stream plus error correction overhead
4. Error correction: compute Reed-Solomon codewords and interleave them with data codewords
5. Module placement: lay out structural elements (finder patterns, timing patterns, alignment patterns) then fill data modules in the zigzag pattern
6. Masking: apply all eight masks, score each, select the best
7. Format and version encoding: write error correction level, mask number, and version info into their reserved positions

The whole process is deterministic. Given the same input data, error correction level, and encoder implementation, you get the same module grid every time.
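One detail sits between data encoding and error correction: the bit stream has to be padded out to exactly fill the version's data codewords. The sketch below shows the usual way this is done: a terminator of up to four zero bits, zero bits up to the next byte boundary, then alternating pad bytes 0xEC and 0x11. The function name is an assumption, and the example input is "HELLO WORLD" in alphanumeric mode for a version 1 level M code (16 data codewords).

```python
def assemble_codewords(bits, n_data_codewords):
    """Turn a header+data bit string into a full list of data codewords."""
    capacity = 8 * n_data_codewords
    if len(bits) > capacity:
        raise ValueError("bit stream does not fit in this version/level")
    bits += "0" * min(4, capacity - len(bits))  # terminator, up to 4 bits
    bits += "0" * (-len(bits) % 8)              # pad to a byte boundary
    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    pad_bytes = (0xEC, 0x11)
    i = 0
    while len(codewords) < n_data_codewords:    # fill the remaining capacity
        codewords.append(pad_bytes[i % 2])
        i += 1
    return codewords


header = "0010" + "000001011"  # alphanumeric mode indicator, count = 11
data = ("01100001011" "01111000110" "10001011100"
        "10110111000" "10011010100" "001101")  # HE LL O_ WO RL D
print(assemble_codewords(header + data, 16))
```

The resulting 16 bytes are what the Reed-Solomon step receives as its data polynomial.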

Want to see this in action? Generate QR codes and experiment with different data lengths and error correction levels.


Why QR Codes Beat Traditional Barcodes

One-dimensional barcodes (UPC, Code 128, EAN-13) encode data in a single row of varying-width bars. They work well for short identifiers — a 12-digit product code, a tracking number — but they hit a hard wall on capacity. A Code 128 barcode maxes out at roughly 20–25 characters before it becomes impractically long.

QR codes store data in two dimensions, which changes the math entirely. A version 5 QR code occupies a 37×37 module square and holds 84 bytes at error correction level M. To hold the same data in Code 128, you'd need a barcode over 15cm wide. The QR code fits in under 2cm.

There's also the orientation advantage. Barcodes must be scanned in one direction — the reader needs to cross the bars perpendicular to their alignment. QR codes scan from any angle, any rotation. The finder patterns handle that.

Practical Implications for Developers

If you're generating QR codes programmatically, a few technical details matter more than others:

Keep payloads under 100 bytes. This keeps the version low (under 7), the module count manageable, and the scan reliability high. If your data is longer, encode a short URL that redirects to it.
Use uppercase URLs when possible. Alphanumeric mode encodes at 5.5 bits per character versus 8 bits in byte mode. That's a 31% savings that directly reduces code size.
Don't default to error correction H. Level M handles most real-world conditions. H makes sense for logo overlays and industrial environments, not for a QR code on a clean flyer.
Respect the quiet zone. Four modules of white space on every side. Cropping into this margin is the number one cause of scan failures in print designs.
Test at target size. A code that scans perfectly on screen might fail when printed at 1.5cm on a business card. Print it, then scan the printout — not the screen.

For more on sizing, contrast, and testing workflows, see the QR code best practices guide.

Frequently Asked Questions

Why do QR codes have three squares in the corners?

The three large squares are finder patterns. They let the scanner detect the code's position, size, and rotation angle in a single pass. Three corners are enough to define a rectangle — the fourth corner is calculated mathematically. This design means QR codes scan correctly even when tilted, rotated 90°, or viewed at an angle.

How does Reed-Solomon error correction work in QR codes?

Reed-Solomon error correction adds redundant codewords to the data stream using polynomial math over a Galois field (GF(256)). When a scanner reads a damaged code, it uses these redundant codewords to reconstruct the missing or corrupted data — similar to how RAID arrays rebuild data from a failed disk. QR codes offer four error correction levels: L recovers 7%, M recovers 15%, Q recovers 25%, and H recovers 30% of damaged codewords.

What's the difference between QR code versions 1 and 40?

Version 1 is the smallest QR code: a 21×21 module grid that holds up to 41 numeric digits at error correction level L. Version 40 is the largest: a 177×177 module grid that holds up to 7,089 numeric digits. Each version increase adds 4 modules per side. Most real-world QR codes fall between version 2 and version 7, since they encode relatively short data like URLs.

Can QR codes store images or files?

Technically yes, but practically no. A version 40 QR code with error correction level L holds a maximum of 2,953 bytes — roughly 2.9 KB. That's far too small for any useful image or document. In practice, QR codes store a URL that points to the file instead. The code itself is just a pointer, not a container.