F-Test
Comparing two variances using the ratio of sample variances, which follows the F-distribution under the null hypothesis of equal population variances.
The F-test tests whether two groups have the same variance, or whether a linear regression model has significant explanatory power.
The F-statistic is a ratio of two variance estimates:
$$F = \frac{s_1^2}{s_2^2}$$
where $s_1^2$ and $s_2^2$ are sample variances from two independent groups of sizes $n_1$ and $n_2$. Under $H_0: \sigma_1^2 = \sigma_2^2$ (equal variances), $F \sim F(n_1 - 1,\, n_2 - 1)$ (F-distribution with $n_1 - 1$ numerator and $n_2 - 1$ denominator degrees of freedom).
The F-distribution is right-skewed and always positive. Large $F$ (or small $F$) is evidence against equal variances (two-sided test).
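The two-sided variance-ratio test described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`reg_inc_beta`, `f_cdf`, `f_test_two_sided`) and the sample data are made up for this example, and the F-distribution CDF is computed from the regularized incomplete beta function with a standard continued-fraction expansion, assuming no statistics library is available.

```python
import math
import statistics


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        num = m * (b - m) * x / ((a + m2 - 1.0) * (a + m2))
        d = 1.0 + num * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + num / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction.
        num = -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))
        d = 1.0 + num * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + num / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(f, d1, d2):
    """P(F <= f) for an F-distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))


def f_test_two_sided(sample_1, sample_2):
    """Return (F, (df1, df2), two-sided p-value) for H0: equal variances."""
    f_stat = statistics.variance(sample_1) / statistics.variance(sample_2)
    d1, d2 = len(sample_1) - 1, len(sample_2) - 1
    lower_tail = f_cdf(f_stat, d1, d2)
    # Two-sided p-value: double the smaller tail probability, capped at 1.
    p_value = min(1.0, 2.0 * min(lower_tail, 1.0 - lower_tail))
    return f_stat, (d1, d2), p_value


# Made-up measurements, for illustration only.
group_1 = [10.1, 9.8, 10.4, 10.0, 9.7, 10.3, 10.2, 9.9]
group_2 = [10.6, 9.2, 10.9, 9.5, 10.8, 9.1, 10.4, 9.6]
f_stat, df, p_value = f_test_two_sided(group_1, group_2)
print(f"F = {f_stat:.3f}, df = {df}, two-sided p = {p_value:.4f}")
```

Swapping the two samples inverts the statistic and swaps the degrees of freedom, but the two-sided p-value is unchanged, so the choice of numerator does not affect the conclusion.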
- The F-statistic is always non-negative: it's a ratio of two non-negative variance estimates
- $F(d_1, d_2)$ and $F(d_2, d_1)$ are related: swapping which variance goes in the numerator inverts the statistic and swaps the degrees of freedom
- Assumes both samples come from normal populations; unlike the t-test, the F-test is highly sensitive to this assumption
- In regression, the F-test for overall significance and individual t-tests for coefficients are testing related but distinct hypotheses
- Using the classic F-test to check the equal-variance assumption before a t-test: this two-stage approach inflates the overall Type I error rate; Welch's t-test (which doesn't assume equal variances) avoids the issue entirely
- Applying the F-test to non-normal data: prefer Levene's test or Brown-Forsythe, which are more robust to violations of normality; Bartlett's test is powerful under normality but, like the F-test, is highly sensitive to non-normal data
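The sensitivity to non-normality noted in the list above can be checked with a small simulation. The sketch below is illustrative only: the sample size, replication count, seed, and the choice of a Laplace distribution as the heavy-tailed alternative are all assumptions made for this example. It calibrates the two-sided 5% critical values from simulated normal data (rather than from F tables), then applies them to data where both groups still have equal variances.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def normal_draw(rng):
    return rng.gauss(0.0, 1.0)


def laplace_draw(rng):
    # The difference of two independent Exp(1) draws follows a Laplace(0, 1) law,
    # which has heavier tails than the normal distribution.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def simulate_ratios(draw, n, reps, rng):
    """Variance ratios for pairs of samples from the SAME distribution (H0 true)."""
    ratios = []
    for _ in range(reps):
        a = [draw(rng) for _ in range(n)]
        b = [draw(rng) for _ in range(n)]
        ratios.append(sample_variance(a) / sample_variance(b))
    return ratios


def rejection_rate(ratios, lower, upper):
    return sum(r < lower or r > upper for r in ratios) / len(ratios)


rng = random.Random(0)       # fixed seed so the run is repeatable
n, reps, alpha = 20, 5000, 0.05

# Empirical critical values for the ratio when the data really are normal.
calibration = sorted(simulate_ratios(normal_draw, n, reps, rng))
lower = calibration[int(reps * alpha / 2)]
upper = calibration[int(reps * (1 - alpha / 2))]

normal_rate = rejection_rate(simulate_ratios(normal_draw, n, reps, rng), lower, upper)
laplace_rate = rejection_rate(simulate_ratios(laplace_draw, n, reps, rng), lower, upper)

print(f"Type I error, normal data:  {normal_rate:.3f}")
print(f"Type I error, Laplace data: {laplace_rate:.3f}")
```

In both scenarios the null hypothesis is true, so any rejection is a false positive. With normal data the rate should sit near the nominal 5%, while with heavy-tailed data it should be noticeably higher, which is the behavior that motivates Levene's or the Brown-Forsythe test.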
Two production lines: Line A (, ), Line B (, ).
with df. Critical values for two-sided test at : 0.35 and 3.09. Since , fail to reject equal variances.
Why is an F-test for equal variances (Levene's test is often preferred) important before running a pooled two-sample t-test?
Solution
The pooled t-test assumes equal variances: it uses the pooled sample variance as a common estimate. If variances actually differ substantially, this pooled estimate is inappropriate, and the t-statistic doesn't follow the t-distribution under $H_0$. The test becomes unreliable (wrong Type I error rate).
However, many statisticians now recommend using Welch's t-test by default (doesn't require equal variances), avoiding the need to pre-test. Pre-testing for equal variances inflates the overall Type I error rate of the subsequent t-test (two-stage testing).
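The difference between the pooled and Welch statistics discussed in this solution can be sketched as follows. The function names and the sample data are hypothetical, chosen only for illustration; the formulas are the standard pooled-variance t-statistic and the Welch statistic with Welch-Satterthwaite degrees of freedom.

```python
import math
import statistics


def pooled_t(sample_1, sample_2):
    """Pooled two-sample t-statistic and its degrees of freedom (assumes equal variances)."""
    n1, n2 = len(sample_1), len(sample_2)
    v1, v2 = statistics.variance(sample_1), statistics.variance(sample_2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t_stat = (statistics.mean(sample_1) - statistics.mean(sample_2)) / std_err
    return t_stat, n1 + n2 - 2


def welch_t(sample_1, sample_2):
    """Welch t-statistic and Welch-Satterthwaite degrees of freedom (no equal-variance assumption)."""
    n1, n2 = len(sample_1), len(sample_2)
    u1 = statistics.variance(sample_1) / n1   # squared standard error of mean 1
    u2 = statistics.variance(sample_2) / n2   # squared standard error of mean 2
    t_stat = (statistics.mean(sample_1) - statistics.mean(sample_2)) / math.sqrt(u1 + u2)
    df = (u1 + u2) ** 2 / (u1 ** 2 / (n1 - 1) + u2 ** 2 / (n2 - 1))
    return t_stat, df


# Made-up data: a small, noisy group versus a larger, tight group.
small_noisy = [4.0, 9.0, 1.0, 12.0, 6.0]
large_tight = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0, 5.1, 4.9, 5.0]

t_pooled, df_pooled = pooled_t(small_noisy, large_tight)
t_welch, df_welch = welch_t(small_noisy, large_tight)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

When the group sizes are equal the two t-statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances, as in the made-up data here, the statistics themselves diverge.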