Welch’s t-test and the Welch-Satterthwaite equation

Welch’s t-test is probably the most commonly used hypothesis test for testing whether two populations have the same mean. Welch’s t-test is generally preferred over Student’s two-sample t-test: while both assume that the populations of the two groups are normal, Student’s t-test assumes that the two populations have the same variance while Welch’s t-test makes no assumption about the variances.

Assume we have n_1 samples from group 1 and n_2 samples from group 2. For j = 1, 2, let \overline{X}_j and s_j^2 denote the sample mean and sample variance of group j respectively. Welch’s t-statistic is defined by

\begin{aligned} t = \dfrac{\overline{X}_1 - \overline{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}. \end{aligned}

Under the null hypothesis, t is approximately distributed as the t-distribution with degrees of freedom

\begin{aligned} \nu = \left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2 \bigg/ \left( \frac{s_1^4}{n_1^2 (n_1 - 1)} + \frac{s_2^4}{n_2^2 (n_2 - 1)} \right). \end{aligned}

The equation above is known as the Welch-Satterthwaite equation.
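As a minimal sketch, both quantities can be computed directly from two samples. The function name welch_t is hypothetical (it is not from any library), and only Python's standard library is used. Note that s_j^4 / (n_j^2 (n_j - 1)) is just (s_j^2 / n_j)^2 / (n_j - 1), which is how the denominator is written below.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x1, x2):
    """Return Welch's t-statistic and the Welch-Satterthwaite degrees of freedom.

    Hypothetical helper for illustration; x1 and x2 are sequences of numbers
    with at least two observations each.
    """
    n1, n2 = len(x1), len(x2)
    # statistics.variance is the sample variance (divides by n - 1).
    v1 = variance(x1) / n1  # s_1^2 / n_1
    v2 = variance(x2) / n2  # s_2^2 / n_2
    t = (mean(x1) - mean(x2)) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, nu


# Made-up data, purely for illustration.
group1 = [5.1, 4.9, 5.6, 5.8, 6.0]
group2 = [6.2, 7.1, 5.9, 8.3, 7.7, 6.8]
t, nu = welch_t(group1, group2)
print(f"t = {t:.4f}, nu = {nu:.4f}")
```

In general \nu is not an integer. When the two groups have equal sizes and equal sample variances, the formula reduces to \nu = 2(n - 1), the degrees of freedom of Student's two-sample test.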

Welch-Satterthwaite equation

Where does the Welch-Satterthwaite equation come from? The commonly cited reference for this is F. E. Satterthwaite’s 1946 paper (Reference 1). Satterthwaite tackles a more general problem: let s_1^2, \dots, s_J^2 be J independent sample variances each having r_j degrees of freedom. Consider the combined estimate

\hat{V}_s = a_1 s_1^2 + \dots + a_J s_J^2,

where a_1, \dots, a_J are constants. (Back then, this was called a complex estimate of variance.) The exact distribution of \hat{V}_s is too difficult to compute (no simple closed form? I’m not sure), so we approximate it with a scaled chi-squared distribution that has the same mean and variance as \hat{V}_s. The question is, how many degrees of freedom should this approximating chi-squared distribution have?

The paper states that the number of degrees of freedom is

\begin{aligned} r_s = \left( \sum_{j=1}^J a_j \mathbb{E}[s_j^2] \right)^2 \bigg/ \left( \sum_{j=1}^J \dfrac{(a_j \mathbb{E}[s_j^2])^2}{r_j} \right). \end{aligned}

In practice we don’t know the value of the expectations, so we replace them with the observed values:

\begin{aligned} \hat{r}_s = \left( \sum_{j=1}^J a_j s_j^2 \right)^2 \bigg/ \left( \sum_{j=1}^J \dfrac{(a_j s_j^2)^2}{r_j} \right). \end{aligned}

The degrees of freedom for Welch’s t-test are given by the formula above with J = 2, r_j = n_j - 1 and a_j = 1 / n_j.
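Here is a small sketch of the plug-in formula for \hat{r}_s, together with the Welch special case. The function name satterthwaite_df and the numbers in the example are made up for illustration.

```python
def satterthwaite_df(a, s2, r):
    """Plug-in Satterthwaite degrees of freedom for sum_j a[j] * s2[j].

    a  : constants a_j
    s2 : observed sample variances s_j^2
    r  : degrees of freedom r_j of each sample variance
    (Hypothetical helper, not from any library.)
    """
    terms = [a_j * s2_j for a_j, s2_j in zip(a, s2)]
    numerator = sum(terms) ** 2
    denominator = sum(term ** 2 / r_j for term, r_j in zip(terms, r))
    return numerator / denominator


# Welch's t-test as a special case: J = 2, a_j = 1 / n_j, r_j = n_j - 1.
n1, n2 = 5, 6            # made-up sample sizes
s2_1, s2_2 = 0.23, 0.81  # made-up sample variances
nu = satterthwaite_df([1 / n1, 1 / n2], [s2_1, s2_2], [n1 - 1, n2 - 1])
print(nu)
```

With a single component (J = 1) the function returns r_1 whatever the values of a_1 and s_1^2, which is a quick sanity check on the formula.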

Heuristic argument

Satterthwaite’s 1946 paper is the commonly cited reference, but that paper actually contains just the formulas and not the heuristic argument. For that, we have to go to his 1941 paper (Reference 2).

Consider first the simple case where we have just two independent sample variances s_1^2 and s_2^2, each with degrees of freedom r_1 and r_2. Let’s compute the degrees of freedom for the chi-squared distribution that approximates V = s_1^2 + s_2^2.

For j = 1, 2, let \sigma_j^2 = \mathbb{E}[s_j^2]. From our set-up, we have \dfrac{r_j s_j^2}{\sigma_j^2} \sim \chi_{r_j}^2, thus

\begin{aligned} \text{Var} (s_j^2) &= \left( \frac{\sigma_j^2}{r_j} \right)^2  \text{Var} (\chi_{r_j}^2) = \left( \frac{\sigma_j^2}{r_j} \right)^2  \cdot (2 r_j) \\  &= \frac{2\sigma_j^4}{r_j} \quad \text{for } j = 1, 2, \\  \text{Var} (V) &= 2\left( \frac{\sigma_1^4}{r_1} + \frac{\sigma_2^4}{r_2} \right), \end{aligned}

where the last equality holds because s_1^2 and s_2^2 are independent. On the other hand, to approximate V by a chi-squared distribution with r degrees of freedom means that \dfrac{r V}{\mathbb{E}[V]} \sim \chi_r^2. Under this approximation,

\text{Var}(V) = \left(\dfrac{\mathbb{E}[V]}{r} \right)^2 \cdot (2 r) = \dfrac{2(\sigma_1^2 + \sigma_2^2)^2}{r}.

For this approximation to be good, we want the variance obtained under the chi-squared approximation to be the same as the true variance. Hence, we have

\begin{aligned} \dfrac{2(\sigma_1^2 + \sigma_2^2)^2}{r} &= 2\left( \frac{\sigma_1^4}{r_1} + \frac{\sigma_2^4}{r_2} \right), \\  r &= (\sigma_1^2 + \sigma_2^2)^2 \bigg/ \left( \frac{\sigma_1^4}{r_1} + \frac{\sigma_2^4}{r_2} \right). \end{aligned}
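The variance matching can be checked numerically with a rough Monte Carlo sketch. Everything here is an illustrative assumption: the values of \sigma_j^2 and r_j, the number of draws, and the helper names simulate_V and sample_variance. Each s_j^2 is drawn as \sigma_j^2 \chi_{r_j}^2 / r_j, using the fact that a chi-squared variable with r degrees of freedom is a gamma variable with shape r/2 and scale 2.

```python
import random


def simulate_V(sigma2, r, n_draws=100_000, seed=0):
    """Draw V = s_1^2 + ... + s_J^2 with s_j^2 = sigma_j^2 * chi2_{r_j} / r_j."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        v = 0.0
        for s2, df in zip(sigma2, r):
            # chi-squared(df) has the same law as Gamma(shape=df/2, scale=2).
            v += s2 * rng.gammavariate(df / 2, 2.0) / df
        draws.append(v)
    return draws


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


sigma2 = [1.0, 2.0]  # assumed true variances sigma_1^2, sigma_2^2
r = [4, 9]           # assumed degrees of freedom r_1, r_2

# True variance of V: 2 * (sigma_1^4 / r_1 + sigma_2^4 / r_2).
true_var = 2 * sum(s2 ** 2 / df for s2, df in zip(sigma2, r))
# Effective degrees of freedom: (sigma_1^2 + sigma_2^2)^2 / (sigma_1^4/r_1 + sigma_2^4/r_2).
r_eff = sum(sigma2) ** 2 / sum(s2 ** 2 / df for s2, df in zip(sigma2, r))
# Variance implied by the chi-squared approximation: 2 * E[V]^2 / r.
chi2_var = 2 * sum(sigma2) ** 2 / r_eff

print("simulated Var(V):      ", sample_variance(simulate_V(sigma2, r)))
print("true Var(V):           ", true_var)
print("chi-squared approx Var:", chi2_var)
```

The last two numbers agree by construction, and the simulated variance should land close to them. The simulation only checks the second moment; it says nothing about how well the chi-squared shape matches the true distribution of V.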

The argument above is perfectly general: we can rerun it to get the effective degrees of freedom for V = a_1 s_1^2 + \dots + a_J s_J^2.
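
To spell out that rerun as a sketch (assuming, as in the two-variance case, that the s_j^2 are independent with \dfrac{r_j s_j^2}{\sigma_j^2} \sim \chi_{r_j}^2), the mean and true variance of V are

\begin{aligned} \mathbb{E}[V] &= \sum_{j=1}^J a_j \sigma_j^2, \\  \text{Var}(V) &= \sum_{j=1}^J a_j^2 \cdot \frac{2 \sigma_j^4}{r_j} = 2 \sum_{j=1}^J \frac{(a_j \sigma_j^2)^2}{r_j}, \end{aligned}

while the chi-squared approximation gives \text{Var}(V) = \dfrac{2 (\mathbb{E}[V])^2}{r}. Equating the two and solving for r yields

\begin{aligned} r = \left( \sum_{j=1}^J a_j \sigma_j^2 \right)^2 \bigg/ \left( \sum_{j=1}^J \dfrac{(a_j \sigma_j^2)^2}{r_j} \right), \end{aligned}

which is the formula for r_s stated in the 1946 paper, since \sigma_j^2 = \mathbb{E}[s_j^2].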


References:

  1. Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components.
  2. Satterthwaite, F. E. (1941). Synthesis of variance.
