Mode
The most frequent value
By Buğra Sözeri
Mode is the most frequent value in a dataset. For [1, 2, 2, 3, 4, 4, 4, 5] the mode is 4: it appears three times, more often than any other value. It is the only measure of central tendency that works for nominal (unordered, non-numeric) data: the mode of [“red”, “blue”, “red”, “green”, “red”] is “red.”
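As a minimal sketch in Python, the definition above reduces to counting occurrences and picking the largest count. The helper name `mode_of` is illustrative only and not part of any library; the same code handles numbers and strings.

```python
from collections import Counter

def mode_of(values):
    """Return (value, count) for the most frequent value in `values`.

    Assumes `values` is non-empty. Ties are broken by first occurrence.
    """
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    return value, count

numeric_mode = mode_of([1, 2, 2, 3, 4, 4, 4, 5])
colour_mode = mode_of(["red", "blue", "red", "green", "red"])
print(numeric_mode)  # (4, 3)
print(colour_mode)   # ('red', 3)
```

Python's standard library also ships `statistics.mode`, which returns just the value; the hand-written version is shown here only because it exposes the count as well.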
Three cases to distinguish:
- Unimodal: exactly one most-frequent value. The standard case.
- Bimodal: two values tie for most frequent. In a histogram of many observations, two separate peaks often point to a mixed population (pooling two groups whose typical values differ, such as heights of adult men and women, can produce one).
- No mode: every value appears exactly once. The mode is technically undefined; some conventions report “no mode,” others report every value as a mode.
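The cases above can be sketched as a small classifier. This is one possible convention, not a standard: the function name `describe_modes` is hypothetical, and it assumes the "no mode" reading when every value appears exactly once in a dataset with more than one value.

```python
from collections import Counter
from statistics import multimode

def describe_modes(values):
    """Classify a non-empty dataset and return (label, list_of_modes).

    Convention assumed here: if every value appears exactly once
    (and there is more than one value), report "no mode".
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return "no mode", []
    # All values that reach the highest count, in first-seen order.
    modes = [value for value, count in counts.items() if count == top]
    labels = {1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "multimodal"), modes

print(describe_modes([1, 2, 2, 3, 4, 4, 4, 5]))  # ('unimodal', [4])
print(describe_modes([1, 1, 2, 3, 3]))           # ('bimodal', [1, 3])
print(describe_modes([1, 2, 3]))                 # ('no mode', [])

# The other convention: statistics.multimode reports every value as a mode.
print(multimode([1, 2, 3]))                      # [1, 2, 3]
```

Which convention to adopt is a reporting choice; the important part is to state it, since the two give different answers on the same data.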
Use the mode when the data is categorical (colours, brands, types) or when you specifically care about the single most common value rather than a typical magnitude. For continuous numerical data where values rarely repeat (heights, salaries), the raw mode is unhelpful; use the mean or median.
Our statistics calculator reports the mode alongside mean and median, handling the bimodal case by listing all most-frequent values.
Why bimodal distributions are a diagnostic, not just a curiosity: when a histogram of continuous data shows two distinct peaks, it often means the dataset is a mixture of two underlying populations. The classic example is heights of adult humans: pooling men and women can produce a two-peaked (or at least unusually broad) curve, while splitting by sex produces two clean unimodal curves. Bimodality in customer-spend distributions often means a free-tier and a paid-tier population are mixed together. Bimodality in response-time distributions often means a fast path and a slow path (cache hit vs cache miss) that need separate treatment. Reporting the overall mean of a bimodal distribution is rarely useful, because it tends to fall in the gap between the peaks where few observations lie; fit a mixture model or split the segments first.
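To make the cache-hit versus cache-miss point concrete, here is a small sketch with invented response times (the numbers and the "hit"/"miss" tags are assumptions for illustration, not measurements). It compares the pooled mean with the per-segment means.

```python
from statistics import mean

# Hypothetical response times in milliseconds, tagged by cache outcome.
requests = [
    ("hit", 12), ("hit", 15), ("hit", 11), ("hit", 14), ("hit", 13),
    ("miss", 240), ("miss", 260), ("miss", 250),
]

# Pooled mean across both paths.
overall_mean = mean(ms for _, ms in requests)

# Split the observations by segment, then summarise each one separately.
by_path = {}
for path, ms in requests:
    by_path.setdefault(path, []).append(ms)
segment_means = {path: mean(times) for path, times in by_path.items()}

print(overall_mean)   # 101.875
print(segment_means)  # {'hit': 13, 'miss': 250}
```

The pooled mean of 101.875 ms describes no request in this sample: every observation is either near 13 ms or near 250 ms. The per-segment means are the numbers worth reporting.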
Mode for continuous data (kernel density estimation): in a continuous dataset where exact value repetition is rare, the “mode” is usually defined as the peak of the kernel density estimate (KDE) rather than the most-frequent raw value. Statistical packages (R’s density(), Python’s scipy.stats.gaussian_kde) compute the KDE; the mode is then read off as the location where the estimated density is highest. The bandwidth parameter, which controls how widely each data point’s contribution to the density spreads, is the main lever, and Silverman’s rule of thumb works well for unimodal, roughly normal data. For bimodal data, choose a bandwidth small enough that the two peaks remain resolved.
Related: mean, median.
Reference: NIST/SEMATECH e-Handbook, Measures of Central Tendency.
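The KDE-based mode described above can be sketched without any third-party package. What follows is a hand-rolled Gaussian KDE evaluated on a grid, not the implementation used by R or SciPy; the function names `silverman_bandwidth` and `kde_mode`, the grid size, and the sample heights are all assumptions made for illustration.

```python
import math
from statistics import quantiles, stdev

def silverman_bandwidth(xs):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    q1, _, q3 = quantiles(xs, n=4)
    spread = min(stdev(xs), (q3 - q1) / 1.34)
    return 0.9 * spread * len(xs) ** (-0.2)

def kde_mode(xs, bandwidth=None, grid_points=512):
    """Return the grid point where a Gaussian KDE of `xs` is highest."""
    h = bandwidth if bandwidth is not None else silverman_bandwidth(xs)
    lo, hi = min(xs) - 3 * h, max(xs) + 3 * h
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]

    def density(x):
        # Sum of Gaussian kernels centred on each observation.
        # The constant normalising factor is dropped: it does not move the peak.
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs)

    return max(grid, key=density)

# Made-up heights in cm; no value repeats, so the raw mode is uninformative.
heights = [158.2, 161.5, 163.1, 164.8, 165.0, 165.9,
           166.3, 167.7, 169.4, 172.6, 176.0]
estimated_mode = kde_mode(heights)
print(round(estimated_mode, 1))
```

Passing an explicit `bandwidth` lets you check how sensitive the estimated mode is to smoothing: on data with two clusters, a large value merges them into one hump and the reported peak can land between the clusters.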
Worked example
A clothing retailer records sizes sold across 200 transactions: {XS: 12, S: 38, M: 64, L: 51, XL: 28, XXL: 7}. Mode = M (64 occurrences). The mode is the most direct summary of this column: a “mean size” requires encoding XS-XXL as numbers, and any spacing chosen between the codes is arbitrary (a median needs only the ordering, and here it also falls on M). Now imagine the same retailer adds a children’s line and pools data: the new sizes histogram is {2T: 30, 4T: 28, 6: 22, S: 38, M: 64, L: 51, XL: 28}. The distribution is bimodal (one peak at 2T within the children’s range 2T-6, another at M within the adult range), and the “most common size sold” (M) actively misleads any decision about kids’-line inventory. Segmenting before computing the mode, once for children and once for adults, recovers the right inventory signal: 2T is the modal children’s size, M is the modal adult size.
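The segmentation step in this example can be sketched in a few lines. The counts come from the pooled histogram above; the set `CHILD_SIZES` and the helper `modal_size` are assumptions introduced for illustration.

```python
from collections import Counter

# Pooled sales counts from the worked example.
pooled = Counter({"2T": 30, "4T": 28, "6": 22, "S": 38, "M": 64, "L": 51, "XL": 28})

# Assumption: these three labels are the children's sizes.
CHILD_SIZES = {"2T", "4T", "6"}

def modal_size(counts):
    """Return the size label with the highest sales count."""
    return counts.most_common(1)[0][0]

# Split the pooled counts into the two segments before taking the mode.
children = Counter({s: n for s, n in pooled.items() if s in CHILD_SIZES})
adults = Counter({s: n for s, n in pooled.items() if s not in CHILD_SIZES})

pooled_mode = modal_size(pooled)
children_mode = modal_size(children)
adult_mode = modal_size(adults)
print(pooled_mode, children_mode, adult_mode)  # M 2T M
```

The pooled mode and the adult mode agree, which is exactly why the pooled figure hides the children's segment.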
When and why it matters
Modes drive operational decisions in retail (which size/colour/SKU to stock most heavily), elections (the modal candidate wins under a plurality system, even without a majority), recommendation systems (most-viewed item per category), and natural-language analysis (the modal word or n-gram in a corpus hints at its topic). The trap is assuming a single mode exists when the underlying population is mixed. Survey researchers, A/B testers, and product analysts hit this constantly: datasets that pool users across segments (geography, plan tier, device type) often show bimodality that disappears when you facet by segment. The defensive habit: always plot the histogram before reporting any single “central tendency” number.
Reference: NIST/SEMATECH e-Handbook, Histogram Interpretation: Bimodal.
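The "plot the histogram first" habit does not need a charting library. A rough text histogram is often enough to spot two peaks. The sketch below assumes integer-valued data and an integer bin width; the function name `text_histogram` and the sample session lengths are invented for illustration.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return one text line per bin, keeping empty bins between min and max.

    Assumes integer values and an integer `bin_width`.
    """
    # Map each value to the lower edge of its bin and count per bin.
    bins = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in range(min(bins), max(bins) + bin_width, bin_width):
        label = f"{start:>4}-{start + bin_width - 1:<4}"
        lines.append(f"{label} | {'#' * bins[start]}")
    return lines

# Hypothetical session lengths in minutes, pooled across two user segments.
sessions = [3, 5, 7, 8, 9, 12, 41, 44, 45, 47, 48, 52]
histogram_lines = text_histogram(sessions, bin_width=10)
print("\n".join(histogram_lines))
```

On this sample the bars cluster around 0-9 and 40-49 with empty bins in between, which is the cue to facet by segment before quoting a mean, median, or mode.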
Frequently asked questions
- What is the mode?
- The mode is the value that appears most frequently in a dataset. For [1, 2, 2, 3, 4], the mode is 2. A dataset can be unimodal (one mode), bimodal (two modes), or multimodal (more than two modes).
- When is the mode useful?
- The mode is most useful for categorical data — the most common shoe size sold, the most popular support ticket category, the most frequent colour ordered. It is the only average that applies to nominal (non-numeric) data.
- What is the difference between mode, mean, and median?
- Mean is the arithmetic average, sensitive to outliers. Median is the middle value, robust to outliers. Mode is the most frequent value, useful for discrete or categorical data. For a symmetric, unimodal distribution like a bell curve, all three are equal.
- What does it mean for a distribution to be bimodal?
- A bimodal distribution has two distinct peaks in its frequency plot, meaning two values (or ranges) are especially common. It often indicates two subgroups in the data — for example, a height dataset that mixes adult males and females.
Published May 16, 2026 · Last reviewed May 31, 2026