Math methodology
Percentage, area, and statistics — the formulas and the conventions.
The Math cluster ships three tools today — percentage, area, and descriptive statistics. The math is grade-school algebra; the value is in the conventions (which formula for variance, which interpolation for percentile, what to do when inputs are degenerate) and in getting those right.
Percentage — three formulas, one tool
The percentage calculator covers the three questions that account for ~95% of percent queries:
- X% of Y: the “find a portion” form.
  result = (X / 100) × Y.
- X is what % of Y: the “ratio as percent” form.
  result = (X / Y) × 100. Returns null if Y is zero.
- Percent change from X to Y: the signed delta.
  result = ((Y − X) / X) × 100. Returns null if X is zero (the division is undefined, so we do not return Infinity).
The third form causes the most confusion. Percent change is signed and uses the starting value as the base: going from 50 to 75 is +50%, while going from 75 to 50 is −33.3%. Percent difference (used in some scientific contexts) uses the average of the two values as the base and is unsigned. Our tool computes percent change.
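The three formulas, plus percent difference for contrast, can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function names are hypothetical, and Python's `None` stands in for the null return described above.

```python
from typing import Optional


def percent_of(x: float, y: float) -> float:
    """X% of Y: the "find a portion" form."""
    return (x / 100) * y


def what_percent(x: float, y: float) -> Optional[float]:
    """X is what % of Y. Returns None when Y is zero."""
    if y == 0:
        return None
    return (x / y) * 100


def percent_change(x: float, y: float) -> Optional[float]:
    """Signed change from X to Y, with the starting value X as the base."""
    if x == 0:
        return None
    return ((y - x) / x) * 100


def percent_difference(x: float, y: float) -> Optional[float]:
    """Unsigned difference with the average of X and Y as the base.

    Shown only for contrast; per the text, the tool computes percent change.
    """
    base = (x + y) / 2
    if base == 0:
        return None
    return abs(y - x) / abs(base) * 100
```

With these definitions, `percent_change(50, 75)` and `percent_change(75, 50)` differ in both sign and magnitude, while `percent_difference` gives the same answer in either direction.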
Area — Heron’s formula for triangles
The area calculator covers eight shapes. Seven are direct algebra:
- Rectangle: A = w × h
- Square: A = s²
- Circle: A = π · r²
- Triangle (base × height): A = ½ · b · h
- Trapezoid: A = ½ · (a + b) · h
- Ellipse: A = π · a · b
- Regular n-gon: A = (n · s²) / (4 · tan(π / n))
The eighth is the triangle-from-three-sides form, which uses Heron’s formula:
A = √(s(s − a)(s − b)(s − c)), where s = (a + b + c) / 2
Heron’s formula is one of the oldest results in elementary geometry: Hero of Alexandria published it in the 1st century CE. It computes triangle area from three side lengths alone, with no need for a height. If the three sides violate the triangle inequality (one side longer than the sum of the other two), the quantity under the square root is negative; if one side exactly equals the sum of the other two, the triangle is degenerate and the quantity is zero. In both cases we return 0 rather than NaN.
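A minimal sketch of the three-sides case, assuming positive side lengths; the function name is hypothetical and this is not the tool's actual code. It guards the square root so that invalid or degenerate triangles give 0 instead of NaN.

```python
import math


def triangle_area_from_sides(a: float, b: float, c: float) -> float:
    """Heron's formula, returning 0.0 when the sides cannot form a triangle."""
    s = (a + b + c) / 2  # semi-perimeter
    radicand = s * (s - a) * (s - b) * (s - c)
    # A negative radicand means the triangle inequality is violated;
    # zero means the three points are collinear. Both yield area 0.
    if radicand <= 0:
        return 0.0
    return math.sqrt(radicand)
```

For the 3-4-5 right triangle, s = 6 and the radicand is 6 · 3 · 2 · 1 = 36, so the area is 6, which matches ½ · 3 · 4.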
Regular polygon formula derivation
Split a regular n-gon with side length s into n isosceles triangles, each with its apex at the centre. The apex angle of each is 2π / n radians. Each triangle’s base is s; its height (the apothem) bisects the apex angle, so tan(π / n) = (s / 2) / apothem and the apothem is s / (2 · tan(π / n)). Triangle area is therefore ½ · s · s / (2 · tan(π / n)) = s² / (4 · tan(π / n)), and the polygon area is n times that.
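The derivation can be checked numerically by building the area from the apothem and comparing it with the closed form. This is a sketch with a hypothetical function name; rejecting n < 3 is an assumption, since the text does not say how the tool handles it.

```python
import math


def regular_polygon_area(n: int, s: float) -> float:
    """Area of a regular n-gon with side length s, built from n triangles."""
    if n < 3:
        # Assumption: fewer than three sides is treated as invalid input.
        raise ValueError("a polygon needs at least 3 sides")
    apothem = s / (2 * math.tan(math.pi / n))  # height of each triangle
    triangle_area = 0.5 * s * apothem          # half base times height
    return n * triangle_area
```

A square (n = 4) with side 2 should give an area close to 4, and a unit hexagon should give close to 3√3 / 2, up to floating-point rounding.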
Statistics — sample vs population
The statistics calculator returns mean, median, mode, variance, standard deviation, range, and quartile cuts for any user-pasted dataset. Two decisions matter:
Variance: sample (n−1) vs population (n)
The textbook formula for population variance is:
σ² = Σ(x − μ)² / n
For a sample drawn from a larger population, the sample mean is closer to the data than the true population mean would be, so the sum of squared deviations under-estimates the true variance. Bessel’s correction divides by n − 1 instead of n to remove this bias:
s² = Σ(x − x̄)² / (n − 1)
Our default is the sample form (with the correction) because most users paste samples, not exhaustive enumerations. A toggle in the UI switches to the population form when needed. The two forms differ by a factor of n / (n − 1): at large n the difference is negligible, but at n = 5 the sample variance is 25% larger than the population variance.
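A minimal sketch of the two variance forms behind one flag. The function name and the `sample` parameter are hypothetical stand-ins for the UI toggle; returning NaN when the divisor is not positive follows the edge-case behaviour this page describes.

```python
import math
from typing import Sequence


def variance(values: Sequence[float], sample: bool = True) -> float:
    """Sample variance (divide by n - 1) by default, population (divide by n) otherwise."""
    n = len(values)
    divisor = n - 1 if sample else n
    if divisor <= 0:
        # Empty input, or a single value with the n - 1 divisor.
        return math.nan
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / divisor
```

For the dataset [2, 4, 4, 4, 5, 5, 7, 9] the mean is 5 and the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7.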
Percentile: NIST linear interpolation
Percentile is ambiguous — there are at least nine documented algorithms (R uses them all under different `type` parameters). We use the simplest defensible one: linear interpolation between the two closest ranks. The 50th percentile equals the median; 0th equals min; 100th equals max. The 25th percentile of [1, 2, …, 10] is 3.25, sitting one-quarter of the way between rank 3 (value 3) and rank 4 (value 4).
This matches NumPy’s default (the `linear` method) and R’s default (type 7). The NIST handbook describes it as an alternative to its primary p(N + 1) rule. It is continuous: small data changes produce small percentile changes, which is what you want for visualisations and dashboards.
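The interpolation rule can be sketched as follows: place the percentile at fractional position (n − 1) · p / 100 in the sorted data (counting from zero) and interpolate between the two neighbouring values. The function name is hypothetical, and both the NaN return for empty input and the range check on p are assumptions for the sketch.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Percentile by linear interpolation between the two closest ranks."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    if len(values) == 0:
        return math.nan
    data = sorted(values)
    h = (len(data) - 1) * p / 100       # fractional zero-based position
    lower = math.floor(h)
    upper = min(lower + 1, len(data) - 1)
    fraction = h - lower                # how far between the two ranks
    return data[lower] + fraction * (data[upper] - data[lower])
```

For [1, 2, …, 10] and p = 25, the position is 9 · 0.25 = 2.25, which lands a quarter of the way from the third value (3) to the fourth (4), giving 3.25 as in the example above.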
Mode handling
Mode is the value (or values) with the highest frequency. We return all tied most-frequent values, sorted, so a bimodal dataset like [1, 1, 2, 2, 3] returns mode [1, 2] rather than picking one arbitrarily. If every value in the dataset appears exactly once, there is no mode by definition and we return an empty array (displayed as “—”).
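A sketch of the tie-handling rule, using a hypothetical `modes` function and the standard library's `Counter`; it is an illustration of the behaviour described, not the tool's own code.

```python
from collections import Counter
from typing import List, Sequence


def modes(values: Sequence[float]) -> List[float]:
    """All values tied for the highest frequency, sorted; empty if there is no mode."""
    if len(values) == 0:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value appears exactly once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == top)
```

Returning a list in every case keeps the caller's handling uniform: one mode, several modes, and no mode differ only in length.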
Precision and edge cases
- Empty input. All summary statistics return NaN; mode returns the empty array. The UI shows “—” for any NaN value.
- Single-value sample variance. The n−1 divisor produces division by zero. We return NaN rather than Infinity.
- Non-numeric tokens in parser. Stripped silently. Pasting “1, 2, banana, 3” produces a three-value dataset.
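The parser behaviour in the last bullet might look like the sketch below. The function name is hypothetical, and two details are assumptions not stated above: tokens are split on commas and whitespace, and strings such as "nan" or "inf" are dropped along with other non-numeric tokens.

```python
import math
import re
from typing import List


def parse_numbers(text: str) -> List[float]:
    """Extract numeric tokens from pasted text, silently skipping the rest."""
    values = []
    # Assumption: commas and any whitespace act as delimiters.
    for token in re.split(r"[,\s]+", text.strip()):
        try:
            number = float(token)
        except ValueError:
            continue  # non-numeric token (or empty string): strip silently
        # Assumption: "nan" and "inf" parse as floats in Python but are not data.
        if math.isfinite(number):
            values.append(number)
    return values
```

Applied to the string from the bullet, `parse_numbers("1, 2, banana, 3")` yields three values, and an empty string yields an empty list, which then flows into the empty-input rule.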
Published May 14, 2026