Mann-Whitney Test
A nonparametric test that compares two groups by ranking all observations; it is valid without normality assumptions.
The Mann-Whitney U test (also called the Wilcoxon rank-sum test) is a nonparametric test that compares two independent groups without assuming normality.
Null hypothesis: the two populations have the same distribution (or, in the location-shift version, the same median).
Procedure:
- Pool both groups and rank all observations (smallest = 1)
- Compute $U_1 = R_1 - \frac{n_1(n_1+1)}{2}$, where $R_1$ is the sum of ranks for group 1
- $U_2 = n_1 n_2 - U_1$ (they sum to $n_1 n_2$)
- The test statistic is $U = \min(U_1, U_2)$; compare to the Mann-Whitney distribution (or use normal approximation for large $n$)
Key properties:
- Uses only the rank order of observations, never their actual numeric values
- Valid for any continuous distribution; no normality assumption required
- $U_1$ and $U_2$ always sum to exactly $n_1 n_2$
- Nearly as powerful as the t-test under normality (asymptotic relative efficiency $3/\pi \approx 0.955$), and often more powerful when the data are skewed or heavy-tailed
Common pitfalls:
- Interpreting a significant result as "the medians differ": strictly, the test detects stochastic dominance (one group tending to produce larger values than the other). If the two distributions differ in shape, not just location, "equal medians" is not the null hypothesis actually being tested
- Ignoring ties: tied values share the average of their ranks, and many ties require a correction factor in the variance of the normal approximation; omitting it can distort the p-value
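The procedure and the tie correction can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than a reference implementation: the function names are made up for this sketch, it assumes the convention $U_1 = R_1 - n_1(n_1+1)/2$, it gives tied values the average of their ranks, and the normal approximation uses the tie-corrected variance $\frac{n_1 n_2}{12}\left[(n+1) - \frac{\sum_k (t_k^3 - t_k)}{n(n-1)}\right]$ (with $n = n_1 + n_2$ and $t_k$ the size of the $k$-th group of ties) without a continuity correction.

```python
from collections import Counter
from math import erfc, sqrt


def midranks(values):
    """Rank values from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U1, U2, U) with U1 = R1 - n1(n1+1)/2 and U = min(U1, U2)."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum of group 1
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2, min(u1, u2)


def normal_approx_p(group1, group2):
    """Two-sided p-value from the normal approximation with tie correction."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    u1, _, _ = mann_whitney_u(group1, group2)
    mean_u = n1 * n2 / 2
    # Each group of t tied values contributes t^3 - t (zero when t = 1).
    tie_term = sum(t**3 - t for t in Counter(list(group1) + list(group2)).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - mean_u) / sqrt(var_u)
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
```

With no ties the `tie_term` is zero and the variance reduces to the familiar $n_1 n_2 (n+1)/12$. The sketch does not guard against the degenerate case where every observation is tied, in which the variance is zero.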
Group A (new UI): $1.9, 2.1, 3.4, 4.0$, Group B (old UI): $3.8, 4.7, 5.2, 6.1$.
Pool and rank: 1.9(A), 2.1(A), 3.4(A), 3.8(B), 4.0(A), 4.7(B), 5.2(B), 6.1(B).
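$R_1 = 1 + 2 + 3 + 5 = 11$, $U_1 = 11 - \frac{4 \cdot 5}{2} = 1$, $U_2 = 4 \cdot 4 - 1 = 15$.
$U = \min(1, 15) = 1$. The small $U$ comes from Group A's statistic $U_1$, which suggests Group A has generally smaller values (faster).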
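As a cross-check on this example, the sketch below recomputes the statistic by pair counting and then compares it to the exact Mann-Whitney distribution by brute-force enumeration. It assumes the convention $U_1 = R_1 - n_1(n_1+1)/2$, under which $U_1$ equals the number of (A, B) pairs where the A value is larger; the variable names are illustrative only.

```python
from itertools import combinations

group_a = [1.9, 2.1, 3.4, 4.0]  # new UI, values read off the pooled ranking
group_b = [3.8, 4.7, 5.2, 6.1]  # old UI

# U for Group A counts the (a, b) pairs with a > b; a tie would count as 1/2.
u_a = sum((a > b) + 0.5 * (a == b) for a in group_a for b in group_b)
u_b = len(group_a) * len(group_b) - u_a
u_obs = min(u_a, u_b)

# Exact null distribution: under the null hypothesis, every way of assigning
# 4 of the 8 ranks to Group A is equally likely (70 assignments in total).
n1, n2 = len(group_a), len(group_b)
null_u = []
for ranks_a in combinations(range(1, n1 + n2 + 1), n1):
    u1 = sum(ranks_a) - n1 * (n1 + 1) // 2
    null_u.append(min(u1, n1 * n2 - u1))

# Two-sided exact p-value: share of assignments with U at least as small.
p_exact = sum(u <= u_obs for u in null_u) / len(null_u)
print(u_a, u_b, u_obs, p_exact)
```

Only one pair (4.0 from A versus 3.8 from B) has the A value larger, so the pair count reproduces $U_1 = 1$. Of the 70 possible rank assignments, 4 give $U \le 1$, so the exact two-sided p-value is $4/70 \approx 0.057$: suggestive, but with only four observations per group it falls just short of the 0.05 level.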
Why might you use the Mann-Whitney test instead of a two-sample t-test when comparing incomes of two professional groups?
Solution
Income distributions are typically right-skewed with heavy tails (a few very high earners). The two-sample t-test assumes (approximately) normal distributions, an assumption that fails badly for heavy-tailed, skewed data: in small samples the test statistic does not follow the t-distribution, so the actual Type I error rate can differ from the nominal level, and the test can lose much of its power.
The Mann-Whitney test uses only the ranks of the observations, not their actual values, so it is far less sensitive to outliers and skewness. It is valid for any continuous distribution; no normality is required.
Additionally, income data may contain extreme outliers (billionaires in the sample) that massively influence the mean but barely affect the rank ordering.
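The outlier point can be illustrated with a small sketch. The incomes below are made-up numbers chosen only for illustration, and `u_statistic` is a hypothetical helper that counts pairs directly; the comparison shows one extreme earner changing the difference in means by orders of magnitude while leaving the rank-based statistic untouched.

```python
from statistics import mean


def u_statistic(x, y):
    """U for x: number of (x_i, y_j) pairs with x_i > y_j, ties counted as 1/2."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


# Hypothetical annual incomes in thousands; not real data.
group_1 = [42, 48, 55, 61, 70]
group_2 = [50, 58, 66, 75, 90]
group_2_outlier = [50, 58, 66, 75, 90_000]  # top earner replaced by an extreme value

mean_gap_before = mean(group_2) - mean(group_1)
mean_gap_after = mean(group_2_outlier) - mean(group_1)
u_before = u_statistic(group_1, group_2)
u_after = u_statistic(group_1, group_2_outlier)

print(mean_gap_before, mean_gap_after)  # the gap in means explodes
print(u_before, u_after)                # U is identical in both cases
```

The largest value was already above every observation in the other group, so making it far larger does not change any pairwise comparison, and therefore does not change any rank or the value of $U$.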