Glossary
Percentile
A relative-ranking statistic
By Buğra Sözeri
A percentile is the value below which a given percentage of observations in a dataset falls. The 90th percentile is the value that 90% of the data falls below; the median is the 50th percentile.
Percentiles are useful when the distribution isn’t normal — which is most of the time. Mean and standard deviation describe a normal distribution well; percentiles describe any distribution by reporting where its mass sits. Income distribution, latency measurements, and test scores are all commonly reported via percentiles for this reason.
Computing a percentile is straightforward in concept, fiddly in practice. The naive “the value at rank p × n in the sorted data” approach works when p × n is an integer; otherwise you have to interpolate. The most common method is linear interpolation between the two closest ranks. NumPy’s default (method="linear") places the zero-based position at (n − 1) × p; the NIST handbook also interpolates linearly but places the one-based rank at p × (n + 1), so the two can return different values for the same data. R supports nine different percentile algorithms via the type parameter of quantile(); they tend to agree closely on large samples and differ most on small samples and in the tails.
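The interpolation step can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names percentile_inclusive and percentile_exclusive and the five-value dataset are assumptions made for this example. The first function places the zero-based position at (n − 1) × p, as the NumPy linear method does; the second uses a common alternative that places the one-based rank at (n + 1) × p.

```python
import math


def percentile_inclusive(sorted_values, p):
    """Interpolate at zero-based position (n - 1) * p, with 0 <= p <= 1."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    position = (n - 1) * p
    lower = math.floor(position)
    upper = min(lower + 1, n - 1)      # stay inside the list when p == 1
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def percentile_exclusive(sorted_values, p):
    """Interpolate at one-based rank (n + 1) * p, clamped to the data range."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    rank = (n + 1) * p
    if rank <= 1:
        return sorted_values[0]
    if rank >= n:
        return sorted_values[-1]
    lower = math.floor(rank)           # one-based index of the lower neighbour
    fraction = rank - lower
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


data = [10, 20, 30, 40, 50]            # illustrative, already sorted
print(percentile_inclusive(data, 0.25))   # 20.0
print(percentile_exclusive(data, 0.25))   # 15.0
```

On five values the two conventions disagree by half the gap between neighbouring observations, which is why it is worth stating the method whenever a percentile is reported for a small sample.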
Quartiles are the 25th, 50th, and 75th percentiles (Q1, Q2 = median, Q3). The interquartile range (IQR) = Q3 − Q1 is a robust measure of spread — robust because it ignores the outer 25% of data on each end, where outliers do their damage.
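A minimal sketch of that robustness, using Python’s standard statistics module. The helper name iqr and both datasets are assumptions made for illustration; method="inclusive" is chosen so the quartiles use (n − 1) × p interpolation.

```python
import statistics


def iqr(values):
    """Interquartile range Q3 - Q1 using inclusive linear interpolation."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]   # largest value replaced by an extreme one

print(iqr(clean), iqr(with_outlier))           # 4.0 4.0
print(statistics.stdev(clean), statistics.stdev(with_outlier))
```

The IQR is identical for both lists because the changed value lies above Q3, while the standard deviation grows by orders of magnitude.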
Use our statistics calculator for any percentile against a pasted dataset.
Percentile vs percentile rank, the easy confusion: a percentile is a value from the dataset (the 90th percentile of test scores is “the score that 90% of students fall below”, e.g. 87). A percentile rank is the inverse: given a value, what fraction of the data sits below it (Alice scored 87 → her percentile rank is 90). Standardised test reports (SAT, GMAT, GRE) almost always report percentile rank rather than the raw percentile of the score, which is the more useful number for the test-taker. The two are related but the distinction matters when reading published distributions.
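The inverse direction can be sketched as follows. The function name percentile_rank and the ten scores are hypothetical, and the “strictly below” definition is an assumption: some sources instead count half of any tied observations.

```python
from bisect import bisect_left


def percentile_rank(sorted_values, x):
    """Percentage of observations strictly below x."""
    if not sorted_values:
        raise ValueError("need at least one value")
    # bisect_left returns how many values in the sorted list are < x.
    return 100.0 * bisect_left(sorted_values, x) / len(sorted_values)


scores = [55, 61, 64, 70, 72, 75, 79, 83, 85, 87]   # hypothetical, already sorted
print(percentile_rank(scores, 87))   # 90.0
```

Here nine of the ten scores sit below 87, so a score of 87 has a percentile rank of 90.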
Why p99 and p99.9 became service-level standards: for web services, the 99th and 99.9th percentile of latency capture the experience of users hitting the slowest paths. A service averaging 100 ms but with a p99 of 5 s feels broken to the users behind the 1% of requests that hit the tail. SLOs (Service Level Objectives) are typically expressed as “99% of requests under 200 ms over a 28-day window”, the Google SRE convention, because users notice tail latency more than average latency. Reference: NIST/SEMATECH e-Handbook, Percentile.
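An SLO of this shape reduces to counting the share of requests under the threshold. The sketch below assumes the latencies for the window are already collected in a list; the function name meets_slo, its defaults, and the sample window are hypothetical.

```python
def meets_slo(latencies_ms, threshold_ms=200.0, target=0.99):
    """Return True if at least `target` of the requests finished under threshold_ms."""
    if not latencies_ms:
        raise ValueError("need at least one request")
    fast = sum(1 for t in latencies_ms if t < threshold_ms)
    return fast / len(latencies_ms) >= target


# Hypothetical window: 985 fast requests and 15 slow ones (1.5% in the tail).
window = [80] * 985 + [450] * 15
print(meets_slo(window))   # False
```

Only 98.5% of the requests in this window are under 200 ms, so the 99% target is missed even though the mean is below 90 ms.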
Worked example
Compute the 90th percentile of 20 response times (ms): [12, 15, 18, 19, 22, 24, 25, 28, 30, 33, 36, 40, 44, 48, 55, 62, 75, 90, 140, 410] (already sorted). Using the NumPy linear method: the position is (n−1) × p = 19 × 0.9 = 17.1. Interpolate between the 18th and 19th values (zero-indexed: indices 17 and 18 = 90 and 140): 90 + 0.1 × (140 − 90) = 95 ms. So p90 = 95 ms. The arithmetic mean of the same data is 61.3 ms (the values sum to 1226), pulled up by the 410 ms outlier, and the median (p50) is 34.5 ms. Reporting these three together (p50 = 34.5, p90 = 95, max = 410) tells a coherent latency story; reporting only the mean would mask the long tail entirely.
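The same figures can be checked with Python’s standard statistics module. This is a sketch under one assumption: method="inclusive" is used because it applies the (n − 1) × p interpolation of the worked example.

```python
import statistics

response_ms = [12, 15, 18, 19, 22, 24, 25, 28, 30, 33,
               36, 40, 44, 48, 55, 62, 75, 90, 140, 410]

# n=10 asks for the nine decile cut points; the last one is the 90th percentile.
deciles = statistics.quantiles(response_ms, n=10, method="inclusive")
p90 = deciles[-1]                      # 95.0
p50 = statistics.median(response_ms)   # 34.5
mean = statistics.mean(response_ms)

print(f"p50={p50} p90={p90} mean={mean} max={max(response_ms)}")
```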
When and why it matters
Percentile-based SLOs are the operational language of modern SRE because mean-based SLOs lie about user experience. A service with mean 100 ms and p99 of 200 ms is a different product from a service with mean 100 ms and p99 of 5 seconds, but both report the same average. A latency budget written as “p99 < 300 ms over 28 days” tells you what to alert on and what to optimise. Outside SRE: paediatrics (a child’s “75th percentile for height” on a growth chart), economic policy (the income share of the top 1% is a percentile statement), and ML evaluation (per-percentile error rates expose worst-case fairness issues invisible to the mean). The defensive habit when consuming any “average” number is to ask for the distribution or at least p50/p90/p99. Reference: Google SRE Book, Service Level Objectives.
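A small synthetic illustration of two services that share a mean but not a tail. Both latency lists are invented for this sketch, and the helper name p99 is an assumption; it uses the inclusive linear interpolation from Python’s statistics module.

```python
import statistics


def p99(latencies_ms):
    """99th percentile using inclusive linear interpolation."""
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[98]


steady = [100] * 100                   # every request takes 100 ms
spiky = [50] * 98 + [2550] * 2         # mostly fast, two very slow requests

for name, sample in (("steady", steady), ("spiky", spiky)):
    print(name, "mean =", statistics.mean(sample), "p99 =", p99(sample))
```

Both lists average exactly 100 ms, yet the p99 is 100 ms for the first and 2550 ms for the second, which is the difference a mean-only report hides.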
Frequently asked questions
- What is a percentile?
- A percentile is a value below which a given percentage of observations in a dataset fall. The 90th percentile of a dataset means 90% of values are below that point.
- How are percentiles used in practice?
- Paediatric growth charts express height and weight as percentiles so parents can see how a child compares to a reference population. Server performance is typically reported as the 95th or 99th percentile latency to capture the tail behaviour that averages hide.
- What is the difference between a percentile and a percentage?
- A percentage is a ratio expressed out of 100 (e.g. a score of 80%). A percentile is a rank position within a distribution: scoring at the 80th percentile means you outperformed 80% of the group, regardless of the absolute score.
Published May 14, 2026 · Last reviewed May 31, 2026