ANOVA
Analysis of variance โ comparing means across three or more groups by partitioning total variability into between-group and within-group components.
Drag the sliders to move group means. ANOVA asks: is the spread between groups larger than the spread within groups?
ANOVA (Analysis of Variance) tests whether the means of three or more groups are all equal.
: (all group means are equal)
: at least one mean differs
ANOVA decomposes total variability in data into:
- Between-group variance: how much group means vary from the grand mean
- Within-group variance: how much individual observations vary within their group
If between-group variance is much larger than within-group variance, we have evidence against .
- Total sum of squares always splits exactly into between-group and within-group components โ nothing is lost or double-counted
- Under , the F-statistic has expectation close to 1, since both numerator and denominator estimate the same noise variance
- A significant ANOVA result only establishes that some mean differs, not which pairs
- Assumes normality within groups, equal variances across groups, and independence of observations
- Running multiple pairwise t-tests instead of post-hoc corrected tests: this reintroduces the multiple-testing problem that ANOVA's single omnibus test was meant to avoid
- Ignoring unequal variances across groups: violating homoscedasticity distorts the F-test's Type I error rate โ check with Levene's test, or use Welch's ANOVA as a robust alternative
Three fertilizers applied to 5 plots each. Crop yields (kg): A: 23, B: 31, C: 23.
The grand mean is 24.6. Group means differ noticeably. ANOVA tests if these differences are larger than expected from random variation.
ANOVA uses the F-statistic . Why is F the right ratio to use? When would you expect F to be large vs. close to 1?
Solution
Under , both and estimate the same population variance , so their ratio is close to 1.
When is false (means differ), the between-group variance picks up the signal (differences between means), while the within-group variance still only estimates noise. So grows large. A large is evidence against .