Men are more likely to get into UC Berkeley's graduate programs than women. Except they aren't. The same data set proves both things simultaneously, which shouldn't be possible but is. Welcome to Simpson's paradox, the statistical glitch that upends how we interpret everything from medical treatments to hiring bias.
Here's the intuitive story we tell ourselves: If a number is true for every group individually, it must be true when you add all the groups together. If men have higher admission rates in engineering and higher rates in English and higher rates in biology, then men must have higher rates overall. Math, right? It feels obvious enough that we don't even question it. Millions of people rely on this logic every day when reading studies about medicine, economics, politics, or literally any field where aggregate statistics matter.
Except the logic is broken. Simpson's paradox is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined. The classic case comes from the very place where you'd expect the most rigorous data handling: academic research. In 1973, UC Berkeley faced accusations of sex bias in graduate admissions. The numbers seemed damning: across the university, men had a higher overall admission rate than women. But when researchers disaggregated the data by department, something strange happened: within most individual departments, women had equal or higher admission rates than men, the reverse of what the aggregated data showed.
How does this work? The twist was hidden in department selectivity. Women disproportionately applied to highly competitive programs like English and psychology, where admission rates were low for everyone. Men were more likely to apply to less selective departments like engineering, where most applicants got in. Once you conditioned on department, accounting for each one's size and competitiveness, the apparent bias dissolved. The UC Berkeley case didn't prove discrimination; it proved how invisible context can hide inside aggregate numbers.
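You can reproduce the reversal with a few lines of arithmetic. The numbers below are invented for illustration, not the real Berkeley figures, but they follow the same pattern: women beat men in every department, yet most women applied to the department where everyone's odds were low.

```python
# Hypothetical admissions data (NOT the real Berkeley figures):
# each tuple is (admitted, applied).
data = {
    "less selective dept":   {"men": (800, 1000), "women": (90, 100)},
    "highly selective dept": {"men": (10, 100),   "women": (200, 1000)},
}

def rate(admitted, applied):
    return admitted / applied

# Within every department, women are admitted at a higher rate...
for dept, groups in data.items():
    m, w = rate(*groups["men"]), rate(*groups["women"])
    print(f"{dept}: men {m:.0%}, women {w:.0%}")

# ...yet pooling reverses the comparison, because most women applied
# to the selective department, dragging their overall rate down.
def overall(sex):
    admitted = sum(g[sex][0] for g in data.values())
    applied = sum(g[sex][1] for g in data.values())
    return admitted / applied

print(f"overall: men {overall('men'):.0%}, women {overall('women'):.0%}")
```

Run it and you'll see women ahead in both departments (90% vs 80%, 20% vs 10%) while men come out ahead overall (74% vs 26%): the aggregate is a weighted average, and the weights differ between the sexes.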
This happens because aggregation erases information about what's actually driving the numbers. A treatment can appear less effective overall while being more effective in every subgroup if the subgroups differ in size or severity of illness. A hiring process can appear fair in aggregate while masking discrimination within specific roles if those roles attract different applicant pools. The paradox doesn't require fraud or intentional manipulation—just unequal distribution of the thing you're measuring across the categories you're measuring.
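The treatment scenario above can be sketched the same way. These counts are made up to show the effect: drug A, given mostly to severe cases, beats drug B within every severity subgroup, yet looks worse when the subgroups are pooled.

```python
# Hypothetical recovery counts, (recovered, treated), per subgroup.
# Drug A is mostly given to severe cases; drug B mostly to mild ones.
trials = {
    "mild":   {"A": (9, 10),   "B": (80, 100)},
    "severe": {"A": (30, 100), "B": (2, 10)},
}

# A wins in both subgroups...
for severity, arms in trials.items():
    a = arms["A"][0] / arms["A"][1]
    b = arms["B"][0] / arms["B"][1]
    print(f"{severity}: A {a:.0%}, B {b:.0%}")

# ...but B wins after pooling, because A's denominator is dominated
# by severe cases, where recovery is rare for everyone.
def pooled(drug):
    recovered = sum(arms[drug][0] for arms in trials.values())
    treated = sum(arms[drug][1] for arms in trials.values())
    return recovered / treated

print(f"pooled: A {pooled('A'):.0%}, B {pooled('B'):.0%}")
```

Nothing here is fraudulent; the pooled comparison simply mixes up the drug effect with the severity of the patients each drug happened to be given.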
The implications ripple through fields that rely on aggregated data to make decisions. Medical researchers comparing drug effectiveness across multiple patient demographics have to be careful; a treatment might look worse overall but better for every individual demographic. Economists analyzing wage gaps, promotion rates, or hiring patterns can reach opposite conclusions depending on how finely they slice the data. Because a trend can vanish or flip when groups are combined, every aggregate statistic you read should come with an implicit question: what groups are being hidden inside this number?
The lesson isn't that statistics lie—it's that they're stupidly easy to misread if you don't interrogate them. The same data that appears to show discrimination can actually show fairness. The same data that appears to show efficacy can hide ineffectiveness in certain subgroups. Before you believe a number, ask: What's being combined here that maybe shouldn't be?