Chroma subsampling
The compression trick JPG and most video codecs use
By Buğra Sözeri
Chroma subsampling is a compression technique that stores colour (chroma) at lower resolution than brightness (luma). It exploits a well-documented quirk of human vision: we’re far more sensitive to brightness contrast than to colour contrast at small scales.
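The notation looks like 4:2:0, 4:2:2, or 4:4:4. The three numbers describe a reference block 4 pixels wide and 2 rows high: the number of luma samples per row (always 4), the number of chroma samples in the first row, and the number of chroma samples in the second row (0 means the second row shares the first row's chroma samples).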
- 4:4:4 (no subsampling). Full colour resolution. Used for graphics, screen captures, anywhere edges and text matter.
- 4:2:2 (chroma at half horizontal resolution). Used in professional video editing.
- 4:2:0 (chroma at quarter resolution: half horizontal, half vertical). Used by JPG, MPEG, H.264, H.265, and most consumer video. Discards 75% of the chroma data, which is 50% of the raw data overall, with almost no perceived quality loss for photographs.
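As a minimal sketch of how the notation maps to sampling factors, the helper below turns a J:a:b string into a (horizontal, vertical) divisor pair. The function name `chroma_factors` is hypothetical, and it only handles the common case where the second-row count is either 0 or equal to the first-row count.

```python
def chroma_factors(notation):
    """Return (horizontal, vertical) chroma divisors for a J:a:b string.

    Hypothetical helper: covers only schemes where b is 0 or equal to a.
    """
    j, a, b = (int(part) for part in notation.split(":"))
    if a <= 0 or j % a != 0:
        raise ValueError(f"unsupported scheme: {notation}")
    if b not in (0, a):
        raise ValueError(f"unsupported scheme: {notation}")
    horizontal = j // a            # luma samples per chroma sample in a row
    vertical = 2 if b == 0 else 1  # b == 0: second row reuses the first row's chroma
    return horizontal, vertical


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, chroma_factors(scheme))
```

Under these assumptions 4:4:4 gives (1, 1), 4:2:2 gives (2, 1), and 4:2:0 gives (2, 2).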
Where 4:2:0 fails: sharp colour edges, especially text. Saturated red text on a saturated blue background gets visibly fuzzy. This is why screenshots should be PNG (no subsampling), while photographs can be JPG (where 4:2:0 is effectively invisible).
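The red-on-blue failure can be illustrated with a small sketch. It assumes the full-range BT.601 conversion used by JFIF and a simple average of the two pixels' chroma; real encoders may filter differently, and the function names are illustrative only.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion (the JFIF convention, assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, rounded and clamped to the 0..255 range."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(max(0, min(255, round(c))) for c in (r, g, b))


def share_chroma(pixel_a, pixel_b):
    """Keep each pixel's own luma but force both to share averaged chroma."""
    ya, cba, cra = rgb_to_ycbcr(*pixel_a)
    yb, cbb, crb = rgb_to_ycbcr(*pixel_b)
    cb, cr = (cba + cbb) / 2, (cra + crb) / 2
    return ycbcr_to_rgb(ya, cb, cr), ycbcr_to_rgb(yb, cb, cr)


red, blue = (255, 0, 0), (0, 0, 255)
print(share_chroma(red, blue))
```

With these inputs neither pixel comes back as pure red or pure blue: both drift toward purple, which is the fuzziness seen at saturated text edges.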
Worked example
A 1920×1080 RGB image stores 1920 × 1080 × 3 bytes = 6,220,800 bytes (~6 MB raw). Convert to YCbCr 4:4:4 and the size is identical: 3 channels at full resolution. Convert to 4:2:2 and the chroma channels (Cb and Cr) drop to 960×1080 each: total bytes = 1920·1080 (Y) + 960·1080·2 (Cb+Cr) = 2,073,600 + 2,073,600 = 4,147,200, a 33% reduction in raw plane data. Convert to 4:2:0 and the chroma planes drop to 960×540 each: total = 2,073,600 + 1,036,800 = 3,110,400, exactly 50% of the original. JPG, H.264, H.265, and lossy WebP typically use 4:2:0 by default, and AVIF is commonly encoded at 4:2:0 as well, which is why a JPG “saves 50%” before any DCT compression even runs: the chroma subsampling provides that baseline for free.
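The arithmetic above can be checked with a short sketch. It assumes 8-bit samples (one byte each) and rounds odd dimensions up; the function name `raw_plane_bytes` is illustrative.

```python
def raw_plane_bytes(width, height, h_factor, v_factor):
    """Raw bytes for one Y plane plus Cb and Cr planes, at 1 byte per sample."""
    luma = width * height
    chroma_w = -(-width // h_factor)   # ceiling division for odd widths
    chroma_h = -(-height // v_factor)  # ceiling division for odd heights
    return luma + 2 * chroma_w * chroma_h


schemes = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
for name, (h, v) in schemes.items():
    print(name, raw_plane_bytes(1920, 1080, h, v))
# 4:4:4 6220800
# 4:2:2 4147200
# 4:2:0 3110400
```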
Newer formats give encoders more room than typical JPEG workflows. AVIF can be encoded at 4:4:4, 4:2:2, or 4:2:0, and JPEG XL normally keeps chroma at full resolution and saves bits through adaptive quantisation instead. In both cases the sampling format is chosen for the whole image rather than per region, so keeping text sharp on a photographic background means encoding the entire image at 4:4:4 and accepting a somewhat larger file.
When and why it matters
Subsampling matters whenever a workflow includes sharp colour edges that 4:2:0 cannot represent: screenshots of code or terminal text, line art with saturated colours, logos with pure red on pure blue, vector exports rasterised for archive, and any image where pixel-accurate colour at edges is non-negotiable. The fix is to pick a format that defaults to (or supports) 4:4:4: PNG (no subsampling at all), AVIF with the encoder forced into 4:4:4 mode (--yuv 444 in libavif's avifenc), JPEG XL, or lossless WebP. The opposite mistake, using PNG for a 12-megapixel photograph because “PNG is higher quality”, can waste 80% or more of the file size on detail the viewer cannot perceive. The professional rule of thumb: photographs → JPG/WebP/AVIF 4:2:0; UI screenshots → PNG or AVIF 4:4:4; mixed content → test both and inspect the result at 200% zoom around any text. Reference: Chroma subsampling: formats and notation.
Why human vision lets us get away with this: the retina has roughly 120 million rod cells (sensitive to brightness, with no colour information) and only about 6 million cone cells (responsible for colour). That roughly 20-to-1 ratio is a commonly cited biological reason chroma subsampling works: losing every other chroma sample is invisible to most viewers, while losing every other luma sample produces obvious blurring and loss of fine detail. The same principle underlies the YCbCr colour space used by JPG and nearly every video codec: separate the channel that matters most (Y, luma) from the two that matter less (Cb, Cr, chroma) so each can be sampled differently.
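To make the luma/chroma split concrete, here is a minimal sketch that assumes the full-range BT.601 coefficients used by JFIF. It converts three grey levels and shows that all of their difference lands in Y, while Cb and Cr stay at the neutral value 128.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB to YCbCr, rounded to integers (JFIF convention assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return round(y), round(cb), round(cr)


greys = {"black": (0, 0, 0), "mid grey": (128, 128, 128), "white": (255, 255, 255)}
for name, rgb in greys.items():
    print(name, rgb_to_ycbcr(*rgb))
```

Because a black-and-white pattern carries no information in Cb or Cr, subsampling those planes leaves it untouched; the same pattern would lose detail if Y were subsampled instead.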
The terminal-screen counter-example: programmers viewing 4:2:0 H.264 screencasts of code regularly complain that text looks soft. The fix is either to switch to a 4:4:4-capable codec (a lossless codec such as FFV1 or HuffYUV, or H.264 in 4:4:4 mode, which is supported by Chrome and OBS but not by YouTube’s standard transcode) or to record at higher resolution so that downscaling at playback hides the chroma artefacts. For non-text content (gameplay, talking-head video, animation), 4:2:0 is generally fine. Related: sRGB, gamma, WebP. Reference: ITU-T T.871, JPEG file interchange format (JFIF).
Frequently asked questions
- What is chroma subsampling?
- Chroma subsampling reduces the resolution of colour (chroma) channels while keeping luma (brightness) at full resolution. It exploits the human visual system's higher sensitivity to brightness than to colour. The most common scheme is 4:2:0, which stores colour at one quarter the resolution of brightness.
- How does chroma subsampling work in JPEG?
- A JPEG encoder converts RGB to YCbCr, then typically reduces the Cb (blue-difference) and Cr (red-difference) channels to half resolution both horizontally and vertically (4:2:0), usually by averaging each 2×2 block into one sample. This cuts colour data by 75% and total raw data by half, with minimal perceived quality loss for photographs.
- What is the difference between 4:4:4 and 4:2:0 chroma subsampling?
- 4:4:4 stores full colour at every pixel — no information is discarded. 4:2:0 stores one colour sample per 2×2 pixel block, reducing colour data by 75%. For text, fine colour gradients, or screen recordings, 4:4:4 is visually superior; for natural photos and video, 4:2:0 is usually sufficient.
- When does chroma subsampling cause visible quality loss?
- Chroma subsampling causes visible artefacts on sharp colour edges — coloured text on a white background, red logos, or green-screen keying. Video editing and broadcast workflows often specify 4:2:2 or 4:4:4 to avoid these artefacts during post-processing.
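The 4:2:0 reduction described in the JPEG answer above can be sketched as a 2×2 box average over one chroma plane. This is a simplified illustration: real encoders may use other filters and chroma sample positions, and the function name `subsample_420` is hypothetical.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane given as a list of equal-length rows."""
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            # Collect up to four samples; edge blocks may be smaller.
            block = [plane[yy][xx]
                     for yy in range(y, min(y + 2, height))
                     for xx in range(x, min(x + 2, width))]
            row.append(round(sum(block) / len(block)))
        out.append(row)
    return out


cb_plane = [[100, 102, 200, 202],
            [104, 106, 204, 206]]
print(subsample_420(cb_plane))  # [[103, 203]]
```

Eight chroma samples become two, which is the 75% reduction in colour data for that plane.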
Published May 14, 2026 · Last reviewed May 31, 2026