F-Test

Comparing two variances using the ratio of sample variances, which follows the F-distribution under the null hypothesis of equal population variances.

[Figure: chi-square distribution with right-tail p-value. df = 4, χ² = 6.0, p-value ≈ 0.1992, fail to reject H₀.]
Definition

The F-test tests whether two groups have the same variance, or whether a linear regression model has significant explanatory power.

The F-statistic is a ratio of two variance estimates:

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2$ and $s_2^2$ are sample variances from two independent groups. Under $H_0$ (equal variances), $F \sim F_{n_1-1,\,n_2-1}$, the F-distribution with $n_1 - 1$ numerator and $n_2 - 1$ denominator degrees of freedom.

The F-distribution is right-skewed and takes only non-negative values. In a two-sided test, an $F$ far above 1 or far below 1 is evidence against equal variances.
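As a minimal sketch of the definition, the snippet below computes the variance ratio and its degrees of freedom from two raw samples. The helper name `f_statistic` and the measurements are hypothetical, made up purely for illustration.

```python
from statistics import variance


def f_statistic(sample_1, sample_2):
    """Return (F, numerator df, denominator df) for two independent samples."""
    s1_sq = variance(sample_1)  # unbiased sample variance (divides by n - 1)
    s2_sq = variance(sample_2)
    return s1_sq / s2_sq, len(sample_1) - 1, len(sample_2) - 1


# Hypothetical measurements, not taken from any real data set.
group_1 = [10.1, 12.4, 9.8, 11.7, 13.0, 8.9]
group_2 = [10.5, 10.9, 11.2, 10.4, 11.0]

f_value, df_num, df_den = f_statistic(group_1, group_2)
print(f"F = {f_value:.3f} with ({df_num}, {df_den}) df")

# Swapping the groups inverts the statistic and swaps the degrees of freedom.
f_swapped, df_num_swapped, df_den_swapped = f_statistic(group_2, group_1)
print(f"Swapped: F = {f_swapped:.3f} with ({df_num_swapped}, {df_den_swapped}) df")
```

Which group goes in the numerator is a convention; a two-sided test reaches the same decision either way, provided the degrees of freedom are swapped along with the variances.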

Key properties
  • The F-statistic is always non-negative: it is a ratio of two non-negative variance estimates
  • $F_{d_1,d_2}$ and $1/F_{d_1,d_2}$ are related: if $X \sim F_{d_1,d_2}$, then $1/X \sim F_{d_2,d_1}$, so swapping which variance goes in the numerator inverts the statistic and swaps the degrees of freedom
  • Assumes both samples come from normal populations; unlike the t-test, the F-test is highly sensitive to this assumption
  • In regression, the F-test for overall significance and individual t-tests for coefficients are testing related but distinct hypotheses
Common mistakes
  • Using the classic F-test to check the equal-variance assumption before a t-test: this two-stage approach inflates the overall Type I error rate; Welch's t-test (which doesn't assume equal variances) avoids the issue entirely
  • Applying the F-test to non-normal data: prefer Levene's test or Brown-Forsythe, which are more robust to violations of normality; Bartlett's test is powerful under normality but even more sensitive to non-normal data
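The sensitivity to non-normality can be illustrated with a small Monte Carlo sketch. It draws both samples from the same distribution, so the null hypothesis of equal variances is true, and counts how often the variance-ratio test rejects. The sample sizes, trial count, seed, and the choice of a Laplace distribution as the heavy-tailed case are all assumptions made for illustration, and the critical values are approximated by simulation rather than taken from F tables.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def simulate_f_ratios(draw, n1, n2, trials):
    """Variance ratios for pairs of samples drawn from the same distribution."""
    ratios = []
    for _ in range(trials):
        a = [draw() for _ in range(n1)]
        b = [draw() for _ in range(n2)]
        ratios.append(sample_variance(a) / sample_variance(b))
    return ratios


rng = random.Random(0)  # fixed seed so the run is repeatable
n1 = n2 = 20            # assumed sample sizes
trials = 5000           # assumed number of simulated experiments


def normal_draw():
    return rng.gauss(0.0, 1.0)


def laplace_draw():
    # The difference of two exponentials follows a Laplace distribution,
    # which has heavier tails than the normal.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


# Approximate two-sided 5% critical values from the normal-theory null.
null_ratios = sorted(simulate_f_ratios(normal_draw, n1, n2, trials))
lower = null_ratios[round(0.025 * trials)]
upper = null_ratios[round(0.975 * trials) - 1]


def rejection_rate(draw):
    """Share of simulated experiments in which equal variances is rejected."""
    ratios = simulate_f_ratios(draw, n1, n2, trials)
    return sum(r < lower or r > upper for r in ratios) / trials


normal_rate = rejection_rate(normal_draw)
laplace_rate = rejection_rate(laplace_draw)
print(f"Rejection rate, normal data:  {normal_rate:.3f}")
print(f"Rejection rate, Laplace data: {laplace_rate:.3f}")
```

With normal data the rejection rate should sit near the nominal 5%. With the heavy-tailed data it should come out clearly higher even though the variances are equal, which is the false-alarm problem that motivates Levene's or Brown-Forsythe's test.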
Testing equal variances before a t-test

Two production lines: Line A ($n = 15$, $s_A^2 = 8.2$), Line B ($n = 12$, $s_B^2 = 3.1$).

$F = 8.2/3.1 \approx 2.65$ with $(14, 11)$ df. Critical values for a two-sided test at $\alpha = 0.05$: approximately 0.32 and 3.36. Since $0.32 < 2.65 < 3.36$, fail to reject equal variances.
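The decision in this example can be checked numerically. The sketch below uses only the Python standard library: it evaluates the F cumulative distribution function through the regularized incomplete beta function (continued-fraction form) and finds quantiles by bisection. The helper names are hypothetical, and doubling the smaller tail is one common convention for a two-sided p-value. In practice a statistics library such as SciPy would normally supply these functions.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even-numbered term of the fraction.
        num = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        d = 1.0 + num * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + num / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the fraction.
        num = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        d = 1.0 + num * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + num / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b), the regularized incomplete beta function."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_cdf(x, dfn, dfd):
    """P(F <= x) for an F-distribution with (dfn, dfd) degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return regularized_incomplete_beta(dfn / 2.0, dfd / 2.0, dfn * x / (dfn * x + dfd))


def f_quantile(q, dfn, dfd):
    """Invert f_cdf by bisection (0 < q < 1)."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, dfn, dfd) < q:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, dfn, dfd) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Line A versus Line B, using the figures from the example.
f_stat = 8.2 / 3.1
dfn, dfd = 15 - 1, 12 - 1
alpha = 0.05

lower = f_quantile(alpha / 2, dfn, dfd)
upper = f_quantile(1 - alpha / 2, dfn, dfd)
upper_tail = 1.0 - f_cdf(f_stat, dfn, dfd)
p_two_sided = 2.0 * min(upper_tail, 1.0 - upper_tail)
reject = f_stat < lower or f_stat > upper

print(f"F = {f_stat:.3f}, critical values = ({lower:.3f}, {upper:.3f})")
print(f"two-sided p-value = {p_two_sided:.3f}, reject H0: {reject}")
```

The lower critical value for $(14, 11)$ df equals the reciprocal of the upper critical value for $(11, 14)$ df, so mixing up the order of the degrees of freedom is an easy way to read the wrong entry from a table.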

Try it

Why does checking for equal variances (with an F-test or, often preferably, Levene's test) matter before running a pooled two-sample t-test?

Solution

The pooled t-test assumes equal variances: it uses the pooled sample variance $s_p^2$ as a common estimate. If the variances actually differ substantially, this pooled estimate is inappropriate, and the t-statistic doesn't follow the t-distribution under $H_0$. The test becomes unreliable (wrong Type I error rate).

However, many statisticians now recommend using Welch's t-test by default (doesn't require equal variances), avoiding the need to pre-test. Pre-testing for equal variances inflates the overall Type I error rate of the subsequent t-test (two-stage testing).
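To make the contrast concrete, the sketch below reuses the Line A and Line B figures and compares the standard error and degrees of freedom that the pooled and Welch approaches would use for the difference in means. It relies on the standard pooled-variance and Welch-Satterthwaite formulas; the variable names are illustrative, and no group means are needed for this comparison.

```python
import math

# Figures from the Line A / Line B example above.
n_a, var_a = 15, 8.2
n_b, var_b = 12, 3.1

# Pooled approach: one common variance estimate, weighted by degrees of freedom.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
pooled_se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
pooled_df = n_a + n_b - 2

# Welch approach: each group keeps its own variance estimate.
term_a, term_b = var_a / n_a, var_b / n_b
welch_se = math.sqrt(term_a + term_b)
# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df = (term_a + term_b) ** 2 / (
    term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
)

print(f"Pooled: s_p^2 = {pooled_var:.3f}, SE = {pooled_se:.3f}, df = {pooled_df}")
print(f"Welch:  SE = {welch_se:.3f}, df = {welch_df:.2f}")
```

The two approaches give different standard errors and degrees of freedom whenever the sample variances differ, and the gap widens as the variances and sample sizes become more unequal.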

Related concepts