Correlation

How tightly two variables move together

By Buğra Sözeri · Published May 16, 2026 · Updated May 31, 2026

Correlation measures the degree to which two variables move together. The standard measure is Pearson’s r: a single number from −1 to +1, where +1 means a perfect positive linear relationship, 0 means no linear relationship, and −1 means a perfect negative linear relationship.
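For reference, the usual sample form of Pearson’s r can be written as follows. The symbols are chosen here for illustration: n paired observations (xᵢ, yᵢ), with x̄ and ȳ as the sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is the sum of cross-products of deviations; the denominator rescales it so the result always falls between −1 and +1.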

A common rule of thumb for interpretation (exact cutoffs vary by field):

|r| < 0.3 — weak
0.3 ≤ |r| < 0.7 — moderate
|r| ≥ 0.7 — strong

Three things every reader of correlation numbers should know:

Pearson’s r only captures linear relationships. Two variables related by a perfect quadratic (y = x²) have r ≈ 0 when x ranges symmetrically over positive and negative values. For relationships that are monotonic but not linear, Spearman’s rho is the more robust alternative; for a non-monotonic pattern such as this quadratic, neither coefficient helps and a plot is the better tool.
Correlation is not causation. Two variables can correlate strongly because A causes B, because B causes A, because both are caused by a third variable, or by pure coincidence (especially in small samples or when comparing many pairs).
Outliers distort r dramatically. A single outlier in a small dataset can flip the sign of the correlation. Always plot the data before trusting the number.
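The first and third points above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the `pearson_r` helper and the toy datasets are made up for illustration and are not from any particular library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)  # assumes neither variable is constant

# Linear-only: a perfect quadratic over a symmetric range of x gives r = 0.
quad_x = [-3, -2, -1, 0, 1, 2, 3]
quad_y = [x ** 2 for x in quad_x]
r_quadratic = pearson_r(quad_x, quad_y)

# Outliers: one extra point flips a perfectly negative relationship to positive.
base_x = [1, 2, 3, 4, 5]
base_y = [5, 4, 3, 2, 1]
r_clean = pearson_r(base_x, base_y)                   # -1.0
r_outlier = pearson_r(base_x + [20], base_y + [20])   # roughly +0.92

print(r_quadratic, r_clean, r_outlier)
```

In the second case, five of the six points still lie on a falling line, which is why a scatter plot catches the problem immediately while the coefficient does not.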

For ordinal or rank-ordered data, use Spearman’s rank correlation instead of Pearson. For two binary variables, use the phi coefficient. For nominal categorical data with more than two levels, use Cramér’s V.
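Spearman’s rank correlation can be understood as Pearson’s r applied to ranks. The sketch below illustrates that idea with the standard library only; `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative names, and tied values are assumed to share the mean of their ranks. In practice a library routine such as `scipy.stats.spearmanr` is the more usual choice.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Sample Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rho computed as Pearson's r on the average ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but non-linear relationship: y = x cubed.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)  # 1.0, because the ordering is preserved exactly
r = pearson_r(xs, ys)       # below 1, because the points do not lie on a line
print(rho, r)
```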

Anscombe’s quartet is the famous illustration: in 1973, statistician Francis Anscombe constructed four small datasets that share nearly identical means, variances, correlation coefficients (about 0.816), and linear-regression lines, yet look completely different when plotted. One is a simple linear trend with noise; one is a smooth curve; one is a tight line with a single outlier; one is a vertical stack of points with one rogue point that alone produces the correlation. The quartet is still cited as the canonical case for “always plot the data first.” The Datasaurus Dozen (Matejka & Fitzmaurice, 2017) extends the same idea to thirteen datasets sharing nearly identical summary statistics: a dinosaur-shaped scatter plus twelve others. Both make the same point: a single correlation number is useful but never sufficient. Reference: NIST/SEMATECH e-Handbook, Linear Correlation.
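The quartet is small enough to check by hand. The sketch below hard-codes the four datasets as they are commonly reproduced (for example in R’s built-in `anscombe` data) and computes Pearson’s r for each; the `pearson_r` helper and the variable names are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Datasets I to III share the same x values; dataset IV has its own.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X_FOUR = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X_FOUR, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in correlations.items():
    print(f"dataset {name}: r = {r:.4f}")
```

All four values land within rounding distance of 0.816, even though only the first dataset looks like what that number suggests.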

Worked example

Take five data points: (1,2), (2,4), (3,5), (4,4), (5,5). The means are x̄ = 3 and ȳ = 4. Deviations x − x̄: −2, −1, 0, 1, 2. Deviations y − ȳ: −2, 0, 1, 0, 1. Sum of cross-products Σ(xᵢ − x̄)(yᵢ − ȳ) = 4 + 0 + 0 + 0 + 2 = 6. Sum of squared x deviations: 10; of y deviations: 6. Pearson r = 6 / √(10 × 6) = 6 / 7.746 ≈ 0.775, a strong positive linear relationship. A scatter plot would confirm that interpretation. If the third point were (3, 50) instead of (3, 5), the cross-product sum would still be 6 (that point sits at x̄), but the sum of squared y deviations would jump to 1716 and r would collapse to about 0.05: a single outlier would dominate the result.
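The same arithmetic can be reproduced step by step in Python. This is a minimal sketch that mirrors the hand calculation; the function name `pearson_steps` and its return layout are illustrative choices.

```python
from math import sqrt

def pearson_steps(points):
    """Return (cross_products_sum, sxx, syy, r) for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]            # deviations from the x mean
    dy = [y - mean_y for y in ys]            # deviations from the y mean
    cross = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return cross, sxx, syy, cross / sqrt(sxx * syy)

original = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
cross, sxx, syy, r = pearson_steps(original)
print(cross, sxx, syy, round(r, 3))          # 6.0 10.0 6.0 0.775

# Variant with the third point replaced by an outlier at (3, 50).
with_outlier = [(1, 2), (2, 4), (3, 50), (4, 4), (5, 5)]
cross_o, sxx_o, syy_o, r_outlier = pearson_steps(with_outlier)
print(cross_o, sxx_o, syy_o, round(r_outlier, 3))  # 6.0 10.0 1716.0 0.046
```

The outlier leaves the numerator untouched because its x deviation is zero, but it inflates the y term in the denominator, so r shrinks from about 0.775 to about 0.046.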

When correlation drives decisions

Portfolio diversification: assets with low pairwise correlation reduce overall portfolio variance even when their individual volatilities are high. The 2008 financial crisis showed the catastrophic counterexample: equities, corporate bonds, REITs, and even gold all moved together when liquidity dried up, and correlation matrices estimated from calm markets understated tail risk.

ML feature engineering: two features with |r| > 0.95 are effectively redundant; dropping one rarely degrades model accuracy and speeds training.

Experimentation: correlated metrics are not independent pieces of evidence, and testing many metrics at once inflates the chance of at least one false positive; apply Bonferroni or Benjamini-Hochberg corrections.

Related: regression, variance. Background: Pearson correlation coefficient (Wikipedia).
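To make the diversification point concrete, here is a small sketch of the standard two-asset portfolio variance formula, σₚ² = w₁²σ₁² + w₂²σ₂² + 2w₁w₂ρσ₁σ₂. The 50/50 weights and 20% volatilities are hypothetical inputs chosen for illustration, and `two_asset_volatility` is an illustrative name.

```python
from math import sqrt

def two_asset_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights w1 and (1 - w1)."""
    w2 = 1.0 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2 * w1 * w2 * rho * sigma1 * sigma2   # the correlation term
    )
    return sqrt(variance)

# Hypothetical inputs: equal weights, both assets at 20% volatility.
vol_perfect = two_asset_volatility(0.5, 0.20, 0.20, 1.0)    # no diversification
vol_high = two_asset_volatility(0.5, 0.20, 0.20, 0.85)      # strongly correlated pair
vol_negative = two_asset_volatility(0.5, 0.20, 0.20, -0.3)  # mildly negative pair

print(round(vol_perfect, 4), round(vol_high, 4), round(vol_negative, 4))
# 0.2 0.1924 0.1183
```

Under these assumptions, a pair with r = 0.85 barely lowers portfolio volatility below that of a single asset, while a pair with r = −0.3 cuts it by roughly 40%. The caveat from 2008 still applies: ρ is an estimate, and it can jump toward 1 in a crisis.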

Frequently asked questions

What is correlation? Correlation (Pearson's r) measures the linear relationship between two variables on a scale of −1 to +1. A value of +1 means a perfect positive linear relationship, −1 means a perfect negative linear relationship, and 0 means no linear relationship.
How is correlation used in practice? A finance analyst finds that two stocks have a correlation of r = 0.85, meaning they move together strongly. Adding the second stock to a portfolio containing the first provides little diversification benefit; a stock with r = −0.3 would provide much more.
What is the difference between correlation and causation? Correlation only measures statistical co-movement, not cause and effect. Ice cream sales and drowning rates are strongly correlated because both rise in summer; ice cream does not cause drowning. Establishing causation requires controlled experiments or causal inference methods.
What is the difference between Pearson and Spearman correlation? Pearson's r measures linear relationships between continuous variables; computing it does not require normality, but the usual significance tests assume roughly normal data, and the value is sensitive to outliers. Spearman's ρ (rho) ranks the data first and measures monotonic relationships, making it more robust to outliers and appropriate for ordinal data like survey ratings.

Published May 16, 2026 · Last reviewed May 31, 2026