To the uninitiated, a QR code looks like digital noise. A random scattering of black squares on a white canvas. But within that apparent chaos lies a highly structured, fault-tolerant system of data storage. This guide explains exactly how QR codes work, from the binary encoding of data to the error correction algorithms that make them readable even when damaged.
A QR (Quick Response) code is a two-dimensional matrix barcode. Unlike traditional barcodes that store data in a single row of lines, QR codes store data both horizontally and vertically. This allows them to hold significantly more information in a smaller space.
The technology was invented in 1994 by Masahiro Hara at Denso (whose barcode business is now Denso Wave), a Toyota Group company, to track automotive parts during manufacturing. The design prioritised three things: high data capacity, fast scanning, and resilience to damage. Thirty years later, those same properties make QR codes ideal for everything from restaurant menus to cryptocurrency wallets.
1 Anatomy of a QR Code
Every QR code is built on a strict grid of black and white squares called modules. The arrangement of these modules is not random. Specific regions serve specific functions, and understanding them reveals how the technology works.
Finder Patterns
The three large squares in the corners are finder patterns. Each is a 7x7 module square with a fixed structure: a 3x3 black centre, surrounded by a one-module white ring, surrounded by a one-module black ring. A line through the centre, whether horizontal, vertical, or diagonal, crosses black-white-black-white-black runs in a 1:1:3:1:1 ratio at any scale. That ratio is distinctive enough for scanners to locate the code quickly.
The finder patterns serve two purposes. First, they tell the scanner "this is a QR code." Second, they establish orientation. Because there are three finder patterns (top-left, top-right, bottom-left) but not a fourth (bottom-right), the scanner can determine which way is up regardless of how the code is rotated.
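The structure described above can be reproduced in a few lines. This is a minimal sketch, not a reference implementation; the function names `finder_pattern` and `run_lengths` are made up for illustration. It builds the 7x7 pattern from the ring distance to the centre and confirms that the middle row has the 1:1:3:1:1 run structure.

```python
def finder_pattern():
    """Return the 7x7 finder pattern as rows of 0/1 (1 = dark module)."""
    grid = []
    for r in range(7):
        row = []
        for c in range(7):
            # Ring index: 0-1 is the 3x3 centre, 2 the white ring, 3 the outer black ring.
            ring = max(abs(r - 3), abs(c - 3))
            row.append(0 if ring == 2 else 1)
        grid.append(row)
    return grid


def run_lengths(line):
    """Collapse a line of modules into the lengths of its same-colour runs."""
    runs = [1]
    for prev, cur in zip(line, line[1:]):
        if cur == prev:
            runs[-1] += 1
        else:
            runs.append(1)
    return runs


pattern = finder_pattern()
for row in pattern:
    print("".join("#" if module else "." for module in row))
print(run_lengths(pattern[3]))  # run lengths along the middle row
```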
Why Three Corners?
The fourth corner is intentionally left empty. This asymmetry lets scanners determine the code's orientation in any rotation. If all four corners had finder patterns, the scanner could not tell which way was up.
Alignment Patterns
Larger QR codes (version 2 and above) contain smaller squares called alignment patterns. These are 5x5 modules with a single black module in the centre, a white ring, and a black ring. They help scanners correct for distortion when codes are printed on curved surfaces or photographed at an angle.
The number of alignment patterns increases with the QR code version. A version 2 code has one alignment pattern. A version 40 code (the largest) has 46 alignment patterns arranged in a grid.
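The counts above follow from the layout: alignment centres sit on a square lattice, and the three lattice positions that would collide with finder patterns are left out. The sketch below assumes the lattice has `version // 7 + 2` positions per axis, which matches the published tables; the function name is illustrative.

```python
def alignment_pattern_count(version: int) -> int:
    """Number of alignment patterns in a QR code of the given version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    if version == 1:
        return 0  # version 1 has no alignment pattern
    per_axis = version // 7 + 2  # lattice positions along one side
    return per_axis ** 2 - 3     # three positions overlap the finder patterns


for v in (1, 2, 7, 40):
    print(v, alignment_pattern_count(v))
```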
Timing Patterns
Two lines of alternating black and white modules connect the finder patterns. One runs horizontally between the top-left and top-right finders. The other runs vertically between the top-left and bottom-left finders. These timing patterns establish the grid coordinate system, allowing the scanner to determine the exact position of every module.
Format Information
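Next to the finder patterns is a 15-bit format information string. It carries two pieces of critical data: the error correction level (L, M, Q, or H) and the mask pattern number (0-7). The string is stored twice for redundancy: one copy wraps around the top-left finder pattern, and the other is split between the top-right and bottom-left finder patterns. Only 5 of the 15 bits are data; the remaining 10 are BCH error correction bits that protect the string itself.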
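As a rough illustration of how the 15 bits are derived, the sketch below packs the level and mask into 5 bits, appends a 10-bit BCH remainder, and XORs the result with the fixed mask the standard uses so that the string is never all zeros. The constants are the ones from the QR specification; the function and variable names are illustrative.

```python
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
GENERATOR = 0b10100110111        # BCH(15,5) generator polynomial
FORMAT_MASK = 0b101010000010010  # fixed XOR mask applied to the final string


def format_bits(ec_level: str, mask_pattern: int) -> int:
    """Return the 15-bit format information value as an integer."""
    if not 0 <= mask_pattern <= 7:
        raise ValueError("mask pattern must be 0-7")
    data = (EC_LEVEL_BITS[ec_level] << 3) | mask_pattern  # 5 data bits
    remainder = data << 10                                # make room for 10 check bits
    for shift in range(4, -1, -1):
        # Polynomial long division over GF(2): cancel the leading bit when set.
        if remainder & (1 << (shift + 10)):
            remainder ^= GENERATOR << shift
    return ((data << 10) | remainder) ^ FORMAT_MASK


print(format(format_bits("L", 0), "015b"))
```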
Version Information
QR codes version 7 and above include an 18-bit version information block. This appears in two locations near the finder patterns and tells the scanner which version (size) of QR code it is decoding. Like format information, it includes error correction bits.
Data and Error Correction Modules
The remaining space contains the actual encoded data interleaved with error correction codewords. This is where your URL, text, or other content lives, along with the redundant information that allows damaged codes to still be read.
Quiet Zone
Every QR code requires a margin of empty space around it called the quiet zone. The specification requires at least 4 modules of white space on all sides. This margin helps scanners distinguish where the code ends and the surrounding content begins. Cropping into the quiet zone is a common cause of scanning failures.
| Component | Size | Purpose |
|---|---|---|
| Finder Pattern | 7x7 modules | Location and orientation detection |
| Alignment Pattern | 5x5 modules | Distortion correction |
| Timing Pattern | 1 module wide | Grid coordinate system |
| Format Information | 15 bits (x2) | Error level and mask pattern |
| Version Information | 18 bits (x2) | QR code version (v7+) |
| Quiet Zone | 4+ modules | Boundary separation |
2 QR Code Versions and Capacity
QR codes come in 40 versions, numbered 1 to 40. Each version adds 4 modules to each side. Version 1 is 21x21 modules. Version 40 is 177x177 modules. Larger versions hold more data but require more space to print at a scannable size.
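The size rule reduces to a one-line formula, shown here as a small sketch (the function name is illustrative):

```python
def modules_per_side(version: int) -> int:
    """Side length in modules: 21 for version 1, plus 4 per additional version."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return 17 + 4 * version


for v in (1, 10, 40):
    side = modules_per_side(v)
    print(f"version {v}: {side}x{side}")
```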
The capacity depends on both the version and the error correction level. Higher error correction means more redundant data and less room for your content. Here are the maximum capacities for common versions:
| Version | Modules | Numeric (L) | Alphanumeric (L) | Bytes (L) |
|---|---|---|---|---|
| 1 | 21x21 | 41 | 25 | 17 |
| 5 | 37x37 | 255 | 154 | 106 |
| 10 | 57x57 | 652 | 395 | 271 |
| 20 | 97x97 | 2,061 | 1,249 | 858 |
| 40 | 177x177 | 7,089 | 4,296 | 2,953 |
Capacities shown for error correction level L (lowest). Higher levels reduce capacity.
Most real-world QR codes are between version 1 and version 10. A typical URL fits comfortably in version 3 or 4. Only specialised applications like encoding entire documents or detailed contact cards require larger versions.
Practical Implication
Shorter content creates simpler QR codes. A 20-character URL produces a cleaner, more scannable code than a 200-character URL. This is why URL shorteners and dynamic QR codes (which use short redirect URLs) often scan more reliably.
3 Data Encoding Modes
QR codes support four encoding modes, each optimised for different types of content. The mode determines how efficiently data is packed into the available space.
Numeric Mode
The most efficient mode, encoding only digits 0-9. Three digits are packed into 10 bits. This allows version 1 to hold 41 numeric characters compared to just 17 bytes of arbitrary data. Numeric mode is ideal for phone numbers, identification codes, or any purely numeric content.
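A minimal sketch of the packing rule, covering only the data bits (no mode indicator or character count). A trailing group of two digits takes 7 bits and a single trailing digit takes 4. The function name is illustrative.

```python
def encode_numeric(digits: str) -> str:
    """Pack decimal digits into a bit string: 3 digits -> 10 bits, 2 -> 7, 1 -> 4."""
    if not all(ch in "0123456789" for ch in digits):
        raise ValueError("numeric mode accepts only the digits 0-9")
    widths = {3: 10, 2: 7, 1: 4}
    bits = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        bits.append(format(int(group), f"0{widths[len(group)]}b"))
    return "".join(bits)


print(encode_numeric("01234567"))
```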
Alphanumeric Mode
Supports 45 characters: digits 0-9, uppercase letters A-Z, and nine symbols: space, $, %, *, +, -, ., /, and colon. Two characters are packed into 11 bits. Note that lowercase letters are not supported in alphanumeric mode. URLs encoded in alphanumeric mode must be uppercase.
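The pairing works by giving each character an index from 0 to 44 and storing `45 * first + second` in 11 bits; an unpaired final character takes 6 bits. The sketch below shows only the data bits, and the function name is illustrative.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"  # index = character value


def encode_alphanumeric(text: str) -> str:
    """Pack alphanumeric characters: 2 characters -> 11 bits, 1 leftover -> 6 bits."""
    values = []
    for ch in text:
        if ch not in ALPHANUMERIC:
            raise ValueError(f"{ch!r} is not in the alphanumeric set")
        values.append(ALPHANUMERIC.index(ch))
    bits = []
    for i in range(0, len(values) - 1, 2):
        bits.append(format(45 * values[i] + values[i + 1], "011b"))
    if len(values) % 2:
        bits.append(format(values[-1], "06b"))
    return "".join(bits)


print(encode_alphanumeric("AC-42"))
```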
Byte Mode
The default mode for most content. Each character uses 8 bits. By default, byte mode uses the ISO-8859-1 character set, which includes lowercase letters, accented characters, and common symbols. Many modern encoders write UTF-8 bytes instead, which most current scanners interpret correctly. Most QR codes containing URLs use byte mode because URLs typically contain lowercase letters.
Kanji Mode
Designed for Japanese text, encoding Shift JIS characters. Each character uses 13 bits. This mode is significantly more efficient than encoding Japanese text as UTF-8 bytes, which would require 3 bytes (24 bits) per character.
| Mode | Characters | Bits per Character | Mode Indicator |
|---|---|---|---|
| Numeric | 0-9 | 3.33 (10 bits per 3 chars) | 0001 |
| Alphanumeric | 0-9, A-Z, 9 symbols | 5.5 (11 bits per 2 chars) | 0010 |
| Byte | ISO-8859-1 | 8 | 0100 |
| Kanji | Shift JIS | 13 | 1000 |
Mixed Mode Encoding
A single QR code can switch between modes. If your content contains a long string of digits followed by text, the encoder can use numeric mode for the digits and byte mode for the text. Each mode switch adds overhead (a 4-bit mode indicator plus a character count), so encoders optimise for the smallest total size.
4 The Encoding Process
Converting data into a QR code involves several steps. Understanding this process reveals why certain design choices matter.
Step 1: Data Analysis
The encoder analyses the input data to determine the optimal encoding mode (or combination of modes). It calculates how many characters need to be encoded and selects the smallest QR code version that can hold the data at the chosen error correction level.
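A simplified sketch of the mode decision for a whole string. It picks a single mode, ignores Kanji and mixed-mode segmentation, and leaves version selection out; real encoders are more thorough. The function name `pick_mode` is illustrative.

```python
ALPHANUMERIC_SET = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def pick_mode(text: str) -> str:
    """Return the densest single mode that can represent every character."""
    if text and all(ch in "0123456789" for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_SET for ch in text):
        return "alphanumeric"
    return "byte"  # fallback: anything else, including lowercase letters


for sample in ("0123456789", "HELLO WORLD", "https://example.com"):
    print(sample, "->", pick_mode(sample))
```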
Step 2: Data Encoding
The data is converted to a binary string according to the selected mode. For byte mode, each character becomes its 8-bit byte value (identical to ASCII for plain English text). The binary string is prefixed with a mode indicator and character count, followed by a short terminator of up to four zero bits, then padded with filler bytes to fill the available data capacity.
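To make the bit layout concrete, here is a minimal sketch for a byte-mode segment in versions 1 to 9, where the character count is 8 bits wide. It appends the terminator, pads to a byte boundary, and fills the rest with the alternating pad bytes 0xEC and 0x11. The function name is illustrative, and the example uses the 19 data codewords of a version 1 code at level L.

```python
def encode_byte_segment(payload: bytes, data_codewords: int) -> bytes:
    """Build the data codewords for one byte-mode segment (versions 1-9 only)."""
    if len(payload) > 255:
        raise ValueError("an 8-bit character count holds at most 255 bytes")
    capacity_bits = data_codewords * 8
    bits = "0100"                                    # byte-mode indicator
    bits += format(len(payload), "08b")              # character count
    bits += "".join(format(b, "08b") for b in payload)
    if len(bits) > capacity_bits:
        raise ValueError("payload does not fit in this version and level")
    bits += "0" * min(4, capacity_bits - len(bits))  # terminator, up to 4 zero bits
    bits += "0" * (-len(bits) % 8)                   # pad to a byte boundary
    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    pad_bytes = (0xEC, 0x11)
    index = 0
    while len(codewords) < data_codewords:           # fill remaining capacity
        codewords.append(pad_bytes[index % 2])
        index += 1
    return bytes(codewords)


print(encode_byte_segment(b"hi", 19).hex(" "))
```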
Step 3: Error Correction Coding
The data codewords are split into blocks (a single block in small versions), and the Reed-Solomon algorithm generates error correction codewords for each block. The final message interleaves the data codewords from all blocks, followed by the interleaved error correction codewords. Interleaving distributes each block across the code so that localised damage affects multiple blocks partially rather than destroying one completely.
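The following sketch shows how error correction codewords can be generated for a single block. It works in GF(256) with the field polynomial 0x11D used by QR codes, builds the generator polynomial from successive powers of alpha, and takes the remainder of a polynomial division. All function names are illustrative, and block splitting and interleaving are left out. Rather than quoting expected values, the example checks that the finished codeword evaluates to zero at each generator root, which is the property the decoder relies on.

```python
# Log and antilog tables for GF(256) with field polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]  # duplicate so LOG sums need no modulo


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec):
    """Product of (x + alpha^i) for i in 0..n_ec-1, highest degree first."""
    poly = [1]
    for i in range(n_ec):
        nxt = poly + [0]  # poly * x
        for j, coef in enumerate(poly):
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # plus poly * alpha^i
        poly = nxt
    return poly


def ec_codewords(data, n_ec):
    """Remainder of data(x) * x^n_ec divided by the generator polynomial."""
    gen = generator_poly(n_ec)
    rem = list(data) + [0] * n_ec
    for i in range(len(data)):
        lead = rem[i]
        if lead:
            for j in range(1, len(gen)):
                rem[i + j] ^= gf_mul(gen[j], lead)
    return rem[-n_ec:]


def poly_eval(poly, x):
    """Evaluate a polynomial (highest degree first) at x using Horner's rule."""
    y = 0
    for coef in poly:
        y = gf_mul(y, x) ^ coef
    return y


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
ec = ec_codewords(data, 10)
print(ec)
print(all(poly_eval(data + ec, EXP[i]) == 0 for i in range(10)))
```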
Step 4: Module Placement
The combined data and error correction bits are placed into the QR code matrix following a specific pattern. The placement starts from the bottom-right corner and proceeds upward in a zigzag pattern, skipping over the function patterns (finders, timing, etc.).
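The zigzag can be sketched as a walk over two-module-wide columns, starting at the right edge, alternating upward and downward, and jumping over the vertical timing column. The predicate `is_function_v1` below is a hypothetical helper that marks the reserved modules of a version 1 code only (finders with separators, timing lines, format areas); larger versions would also need alignment and version areas.

```python
def is_function_v1(row, col, size=21):
    """True for modules reserved by function patterns in a version 1 code."""
    in_finder = (
        (row < 8 and col < 8)
        or (row < 8 and col >= size - 8)
        or (row >= size - 8 and col < 8)
    )
    in_timing = row == 6 or col == 6
    in_format = (row == 8 and (col < 9 or col >= size - 8)) or (
        col == 8 and (row < 9 or row >= size - 8)
    )
    return in_finder or in_timing or in_format


def placement_order(size, is_function):
    """List (row, col) positions in the order data bits are placed."""
    positions = []
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:
            col -= 1  # the vertical timing pattern occupies column 6
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            for c in (col, col - 1):  # right module first, then left
                if not is_function(row, c):
                    positions.append((row, c))
        upward = not upward
        col -= 2
    return positions


order = placement_order(21, is_function_v1)
print(order[:4], len(order))
```

For version 1 the walk yields 208 positions, which is 26 codewords of 8 bits.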
Step 5: Masking
A mask pattern is applied to balance the distribution of black and white modules. The encoder tries all eight mask patterns and selects the one with the lowest penalty score (explained in the masking section below).
Step 6: Format and Version Information
The format information (error correction level and mask pattern) is encoded with its own error correction and placed in the reserved areas. For version 7 and above, version information is similarly encoded and placed.
5 Error Correction
Error correction is what makes QR codes robust. A QR code can be read even when partially obscured, damaged, or printed imperfectly. This is possible because of Reed-Solomon error correction, the same algorithm used in CDs, DVDs, and deep-space communications.
Reed-Solomon Algorithm
Reed-Solomon works by treating data as coefficients of a polynomial and adding redundant symbols calculated from that polynomial. When errors occur, the decoder can identify both the location and value of corrupted symbols, then reconstruct the original data.
The mathematics are complex, but the practical result is straightforward: for every two error correction codewords, the algorithm can correct one corrupted codeword. Higher error correction levels add more redundant codewords, enabling recovery from more damage.
Error Correction Levels
| Level | Recovery Capacity | Data Overhead | Best For |
|---|---|---|---|
| L (Low) | ~7% of codewords | ~20% | Clean environments, digital displays |
| M (Medium) | ~15% of codewords | ~38% | General use, standard printing |
| Q (Quartile) | ~25% of codewords | ~55% | Industrial settings, potential wear |
| H (High) | ~30% of codewords | ~65% | Adding logos, harsh environments |
The trade-off is straightforward: higher error correction means less room for data. A version 10 code with level L can hold 271 bytes. The same version with level H holds only 119 bytes. Choose based on your use case.
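The byte figures can be derived from the number of data codewords left after error correction. A rough sketch: subtract the 4-bit mode indicator and the character count (8 bits for versions 1 to 9, 16 bits from version 10 upward), then divide by 8. The data codeword counts used here (19 for version 1 at L, 274 and 122 for version 10 at L and H) come from the specification's tables; the function name is illustrative.

```python
def byte_capacity(data_codewords: int, version: int) -> int:
    """Maximum byte-mode payload for a given number of data codewords."""
    count_bits = 8 if version <= 9 else 16
    return (data_codewords * 8 - 4 - count_bits) // 8


print(byte_capacity(19, 1))    # version 1, level L
print(byte_capacity(274, 10))  # version 10, level L
print(byte_capacity(122, 10))  # version 10, level H
```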
Adding Logos
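If you plan to overlay a logo on your QR code, use error correction level H. The logo covers data modules, and you need the maximum recovery capacity to compensate. Treat 30% as a theoretical ceiling rather than a target: the logo competes with dirt, glare, and print defects for the same error budget, so keep it well below that share of the code area and test thoroughly.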
6 Masking Patterns
After data is placed in the QR code matrix, a mask pattern is applied. Masking exists to prevent problematic patterns that could confuse scanners.
Why Masking Is Necessary
Without masking, the data might create large blocks of solid colour or patterns that resemble finder patterns. A large white area could be mistaken for a quiet zone. A pattern similar to a finder pattern could confuse the scanner's orientation detection. Masking breaks up these problematic patterns.
The Eight Mask Patterns
The QR specification defines eight mask patterns, numbered 0-7. Each is defined by a mathematical formula that determines whether a module at position (row, column) should be inverted:
| Pattern | Formula (invert if true) | Visual Effect |
|---|---|---|
| 0 | (row + column) mod 2 == 0 | Checkerboard |
| 1 | row mod 2 == 0 | Horizontal stripes |
| 2 | column mod 3 == 0 | Vertical stripes (every 3rd) |
| 3 | (row + column) mod 3 == 0 | Diagonal stripes |
| 4 | (floor(row/2) + floor(column/3)) mod 2 == 0 | Large checkerboard |
| 5 | (row*column) mod 2 + (row*column) mod 3 == 0 | Cross pattern |
| 6 | ((row*column) mod 2 + (row*column) mod 3) mod 2 == 0 | Diamond pattern |
| 7 | ((row+column) mod 2 + (row*column) mod 3) mod 2 == 0 | Complex pattern |
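The table translates directly into code. In this sketch the divisions in pattern 4 are integer (floor) divisions, and `apply_mask` takes an optional `is_function` predicate (a hypothetical helper) so that finder, timing, and format modules are left alone. Because masking is an XOR, applying the same mask twice returns the original matrix.

```python
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(matrix, mask_id, is_function=lambda r, c: False):
    """Return a copy of matrix with data modules inverted where the mask is true."""
    condition = MASKS[mask_id]
    return [
        [
            bit ^ 1 if condition(r, c) and not is_function(r, c) else bit
            for c, bit in enumerate(row)
        ]
        for r, row in enumerate(matrix)
    ]


blank = [[0] * 6 for _ in range(6)]
masked = apply_mask(blank, 0)
for row in masked:
    print("".join("#" if bit else "." for bit in row))
print(apply_mask(masked, 0) == blank)  # masking twice restores the input
```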
Penalty Scoring
The encoder applies each mask pattern and calculates a penalty score based on four rules:
- Consecutive modules: Penalty for five or more same-colour modules in a row or column
- Block patterns: Penalty for 2x2 blocks of the same colour
- Finder-like patterns: Penalty for patterns resembling finder patterns (1:1:3:1:1 ratio)
- Colour balance: Penalty if the ratio of dark to light modules is far from 50%
The mask with the lowest total penalty is selected. This ensures the final QR code has good contrast distribution and avoids confusing patterns.
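A compact sketch of the four rules follows. The weights (3 plus one per extra module for runs, 3 per 2x2 block, 40 per finder-like sequence, 10 per 5% step of imbalance) follow the specification, and the finder-like rule looks for the 1:1:3:1:1 sequence with four light modules on one side. Rounding at exact 5% boundaries varies between implementations, so treat the last rule as an approximation. The function name is illustrative.

```python
def penalty_score(matrix):
    """Score a square 0/1 matrix (1 = dark); lower is better."""
    n = len(matrix)
    rows = [list(row) for row in matrix]
    cols = [[matrix[r][c] for r in range(n)] for c in range(n)]
    score = 0

    # Rule 1: runs of five or more same-colour modules in a row or column.
    for line in rows + cols:
        run = 1
        for i in range(1, n):
            if line[i] == line[i - 1]:
                run += 1
            else:
                if run >= 5:
                    score += run - 2  # 3 points for five, plus 1 per extra module
                run = 1
        if run >= 5:
            score += run - 2

    # Rule 2: every 2x2 block of a single colour (overlapping blocks count).
    for r in range(n - 1):
        for c in range(n - 1):
            if matrix[r][c] == matrix[r][c + 1] == matrix[r + 1][c] == matrix[r + 1][c + 1]:
                score += 3

    # Rule 3: finder-like 1:1:3:1:1 sequence next to four light modules.
    patterns = ([1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1])
    for line in rows + cols:
        for i in range(n - 10):
            if line[i:i + 11] in patterns:
                score += 40

    # Rule 4: 10 points for each full 5% the dark share deviates from 50%.
    dark = sum(sum(row) for row in matrix)
    total = n * n
    score += (abs(dark * 20 - total * 10) // total) * 10
    return score


checkerboard = [[(r + c) % 2 for c in range(6)] for r in range(6)]
all_light = [[0] * 6 for _ in range(6)]
print(penalty_score(checkerboard), penalty_score(all_light))
```

An encoder would call a function like this once per mask and keep the mask with the smallest score.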
7 How Scanning Works
When your phone camera points at a QR code, it performs a sophisticated sequence of operations in milliseconds. Understanding this process explains why some codes scan better than others.
Step 1: Detection
The scanner searches the image for finder patterns. It looks for the distinctive 1:1:3:1:1 ratio of black and white that uniquely identifies QR code corners. Modern scanners can detect finder patterns at various scales and rotations simultaneously.
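One way to picture the search: run-length encode a row of thresholded pixels and look for five consecutive runs, starting with dark, whose lengths fit 1:1:3:1:1 within a tolerance. The sketch below handles a single row; the function name and the tolerance value are assumptions, and a real detector would confirm each candidate vertically as well.

```python
def finder_candidates(row, tolerance=0.5):
    """Return centre x positions where a row of 0/1 pixels (1 = dark) fits 1:1:3:1:1."""
    # Run-length encode the row as (value, start, length) triples.
    runs = []
    start = 0
    for i in range(1, len(row) + 1):
        if i == len(row) or row[i] != row[start]:
            runs.append((row[start], start, i - start))
            start = i

    centres = []
    expected = (1, 1, 3, 1, 1)
    for k in range(len(runs) - 4):
        window = runs[k:k + 5]
        if window[0][0] != 1:
            continue  # the sequence must start with a dark run
        lengths = [length for _, _, length in window]
        unit = sum(lengths) / 7  # estimated width of one module in pixels
        if all(abs(l - e * unit) <= tolerance * unit for l, e in zip(lengths, expected)):
            _, mid_start, mid_length = window[2]
            centres.append(mid_start + mid_length / 2)
    return centres


# A finder pattern cross-section at 2 pixels per module, with light margins.
scan_line = [0] * 4 + [1] * 2 + [0] * 2 + [1] * 6 + [0] * 2 + [1] * 2 + [0] * 4
print(finder_candidates(scan_line))
```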
Step 2: Orientation
Once three finder patterns are located, the scanner calculates the code's orientation. The position of the three patterns (and the absence of a fourth) determines which corner is which. The scanner virtually rotates the image so the code is right-side-up.
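A geometric sketch of the corner assignment, assuming the three finder centres are already known in image coordinates (y grows downward) and the code is not mirrored. The top-left finder is the one opposite the longest side of the triangle; the sign of a cross product then separates top-right from bottom-left. The function name is illustrative.

```python
def order_finders(points):
    """Return (top_left, top_right, bottom_left) from three (x, y) finder centres."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    a, b, c = points
    # Each candidate: (length of the opposite side squared, corner, other two corners).
    candidates = [
        (dist2(b, c), a, b, c),
        (dist2(a, c), b, a, c),
        (dist2(a, b), c, a, b),
    ]
    _, top_left, p, q = max(candidates, key=lambda item: item[0])

    # With y pointing down, (top_right - top_left) x (bottom_left - top_left) is positive.
    cross = (p[0] - top_left[0]) * (q[1] - top_left[1]) - (p[1] - top_left[1]) * (q[0] - top_left[0])
    if cross > 0:
        return top_left, p, q
    return top_left, q, p


print(order_finders([(0, 10), (10, 0), (0, 0)]))      # upright code
print(order_finders([(0, 10), (10, 0), (10, 10)]))    # same code rotated 180 degrees
```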
Step 3: Perspective Correction
If the code was photographed at an angle, the scanner applies a perspective transform to correct the distortion. Alignment patterns (in larger codes) help refine this correction. The result is a square, properly oriented view of the code.
Step 4: Grid Sampling
Using the timing patterns as guides, the scanner creates a grid over the code and samples the centre of each module position. Each sample is classified as black or white based on a threshold calculated from the image's brightness distribution.
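The sampling step can be sketched as follows, assuming the image has already been deskewed into a square grayscale grid and the module count is known. The global mean is used as the threshold purely for simplicity; real scanners generally use more robust, often locally adaptive, thresholds. The function name is illustrative.

```python
def sample_modules(image, modules):
    """Classify the centre pixel of each module as dark (1) or light (0).

    image: 2D list of grayscale values (0 = black, 255 = white), already deskewed.
    modules: number of modules per side.
    """
    height = len(image)
    width = len(image[0])
    pixels = [p for row in image for p in row]
    threshold = sum(pixels) / len(pixels)  # simple global threshold

    grid = []
    for r in range(modules):
        y = int((r + 0.5) * height / modules)  # centre of the module, vertically
        row = []
        for c in range(modules):
            x = int((c + 0.5) * width / modules)  # centre of the module, horizontally
            row.append(1 if image[y][x] < threshold else 0)
        grid.append(row)
    return grid


# Synthetic 3x3-module image at 4 pixels per module, dark = 30, light = 220.
pattern = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
image = [[30 if pattern[y // 4][x // 4] else 220 for x in range(12)] for y in range(12)]
print(sample_modules(image, 3) == pattern)
```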
Step 5: Format Information
The scanner reads the format information to determine the error correction level and mask pattern. This information is protected by its own error correction, so it can be recovered even if partially damaged.
Step 6: Unmasking
The scanner applies the identified mask pattern a second time. Masking is an XOR operation, so repeating it cancels the original and reveals the unmasked data and error correction modules.
Step 7: Data Extraction
Following the zigzag placement pattern in reverse, the scanner reads the data and error correction codewords. The codewords are de-interleaved into their original blocks.
Step 8: Error Correction
The Reed-Solomon decoder checks each block for errors. If errors are found within the correction capacity, they are corrected. If too many errors exist, the decode fails.
Step 9: Data Decoding
The corrected data stream is parsed according to the mode indicators. The mode indicator specifies how to interpret the following bits. The character count tells the decoder how many characters to extract. The final result is the original encoded content.
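As a minimal illustration of the parsing step, the sketch below reads one byte-mode segment from corrected data codewords, assuming a version 1 to 9 code where the character count is 8 bits wide. Other modes, multiple segments, and longer counts are left out, and the function name is illustrative.

```python
def decode_byte_segment(codewords: bytes) -> bytes:
    """Extract the payload of a single byte-mode segment (versions 1-9)."""
    bits = "".join(format(b, "08b") for b in codewords)
    if bits[:4] != "0100":
        raise ValueError("this sketch only handles the byte-mode indicator 0100")
    count = int(bits[4:12], 2)             # 8-bit character count
    data = bits[12:12 + 8 * count]
    if len(data) < 8 * count:
        raise ValueError("stream ends before the declared character count")
    # Terminator and pad bytes after the payload are simply ignored.
    return bytes(int(data[i:i + 8], 2) for i in range(0, len(data), 8))


print(decode_byte_segment(bytes([0x40, 0x26, 0x86, 0x90, 0xEC, 0x11])))
```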
Speed of Scanning
This entire process typically completes in under 100 milliseconds. Modern smartphone cameras can detect and decode multiple QR codes simultaneously, though they usually focus on the most prominent one.
8 Why QR Codes Are So Reliable
The QR code specification includes multiple layers of redundancy and error tolerance. This is not accidental. The technology was designed for factory floors where codes might be dirty, damaged, or poorly printed.
Orientation Independence
The three-corner finder pattern design means QR codes scan correctly regardless of rotation. You can hold your phone at any angle. This removes a major failure point that plagued earlier barcode technologies.
Distortion Tolerance
Alignment patterns and perspective correction algorithms allow QR codes to be read from steep angles. A code printed on a curved bottle or photographed from the side can still be decoded successfully.
Damage Resistance
At level H, Reed-Solomon error correction can recover data even when roughly 30% of the codewords are unreadable; lower levels tolerate proportionally less. This handles scratches, stains, partial coverage, and printing defects. The interleaving of data across the code means localised damage is distributed across multiple blocks.
High Contrast Requirement
The binary nature of QR codes (black or white, no grey) makes them resistant to brightness variations. As long as there is sufficient contrast between the foreground and background, the scanner can reliably distinguish modules.
Standardisation
The ISO/IEC 18004 specification defines every aspect of QR code encoding and decoding. This standardisation ensures that any compliant encoder produces codes that any compliant decoder can read, so codes from different generators work across different scanners.
What Makes Codes Scan Well
- High contrast (dark on light)
- Intact quiet zone
- Appropriate size for distance
- Clean, sharp printing
- Shorter data content
Common Scanning Problems
- Low contrast colours
- Cropped quiet zone
- Code too small for distance
- Blurry or pixelated printing
- Oversized logo covering data
Summary
A QR code is not random noise. It is a precisely engineered data structure with multiple redundancy mechanisms. The finder patterns establish location and orientation. The timing patterns create a coordinate grid. The encoding modes optimise data density. The Reed-Solomon algorithm recovers from errors. The masking patterns ensure good contrast distribution.
Understanding these mechanisms explains practical observations. Shorter URLs create simpler codes because less data means fewer modules. Adding logos requires high error correction because you are deliberately damaging data. Dark colours on light backgrounds work best because scanners need clear contrast. The quiet zone matters because scanners need to find the code boundaries.
The technology is over 30 years old but remains relevant because it was well-designed from the start. The original goals of high capacity, fast scanning, and damage resistance align perfectly with modern use cases. From factory floors to restaurant menus, the fundamental engineering has proven remarkably durable.
Create Your Own QR Code
Now that you understand the technology, put it to use. LinkScan generates QR codes with customisable error correction, colours, and styles.
Further Reading
- QR Code Size Guide for print and digital applications
- Static vs Dynamic QR Codes for choosing the right type
- History of QR Codes from 1994 to today
- ISO/IEC 18004 the official QR code specification