Fieller’s confidence sets for ratio of two means

Assume you have a sample $(X_1, Y_1), \dots, (X_n, Y_n)$ drawn i.i.d. from some underlying distribution. Let $\mathbb{E}[X] = \mu_1$ and $\mathbb{E}[Y] = \mu_2$. Our goal is to estimate the ratio of means $\rho := \mu_2 / \mu_1$ and provide a level-$(1-\alpha)$ confidence set for it, i.e. a set $R$ such that $\mathbb{P}_\rho \{ \rho \in R \} \geq 1 - \alpha$. Fieller’s confidence sets are one way to do this. (The earliest reference to Fieller’s work is listed as Reference 1. The exposition here follows that in von Luxburg & Franz 2009 (Reference 2).)

Define the standard estimators for the means and covariances:

\begin{aligned} \hat\mu_1 &:= \dfrac{1}{n}\sum_{i=1}^n X_i, \quad \hat\mu_2 := \dfrac{1}{n}\sum_{i=1}^n Y_i, \\ \hat{c}_{11} &:= \dfrac{1}{n(n-1)}\sum_{i=1}^n (X_i - \hat\mu_1)^2, \\ \hat{c}_{22} &:= \dfrac{1}{n(n-1)}\sum_{i=1}^n (Y_i - \hat\mu_2)^2, \\ \hat{c}_{12} &:= \hat{c}_{21} := \dfrac{1}{n(n-1)} \sum_{i=1}^n (X_i - \hat\mu_1)(Y_i - \hat\mu_2). \end{aligned}

Let $q := q(t_{n-1}, 1 - \alpha/2)$ be the $(1 - \alpha/2)$-quantile of the $t$ distribution with $n-1$ degrees of freedom. Compute the quantities

\begin{aligned} q_\text{exclusive}^2 &= \dfrac{\hat\mu_1^2}{\hat{c}_{11}}, \\ q_\text{complete}^2 &= \dfrac{\hat\mu_2^2 \hat{c}_{11} - 2\hat\mu_1 \hat\mu_2 \hat{c}_{12} + \hat\mu_1^2 \hat{c}_{22} }{\hat{c}_{11}\hat{c}_{22} - \hat{c}_{12}^2}, \end{aligned}

and

\begin{aligned} \ell_{1, 2} = \dfrac{1}{\hat\mu_1^2 - q^2 \hat{c}_{11}} \left[ (\hat\mu_1 \hat\mu_2 - q^2 \hat{c}_{12}) \pm \sqrt{(\hat\mu_1 \hat\mu_2 - q^2 \hat{c}_{12})^2 - (\hat\mu_1^2 - q^2 \hat{c}_{11})(\hat\mu_2^2 - q^2 \hat{c}_{22})} \right] .\end{aligned}

The confidence set $R_\text{Fieller}$ is defined as follows:

\begin{aligned} R_\text{Fieller} = \begin{cases} (-\infty, \infty) &\text{if } q_\text{complete}^2 \leq q^2, \\ (-\infty, \min(\ell_1, \ell_2)] \cup [ \max(\ell_1, \ell_2), \infty) &\text{if } q_\text{exclusive}^2 < q^2 < q_\text{complete}^2, \\ [\min(\ell_1, \ell_2), \max(\ell_1, \ell_2) ] &\text{otherwise.} \end{cases} \end{aligned}

The following result is often known as Fieller’s theorem:

Theorem (Fieller). If $(X, Y)$ is jointly normal, then  $R_{Fieller}$ is an exact confidence region of level $1 - \alpha$ for $\rho$, i.e. $\mathbb{P}_\rho \{ \rho \in R \} = 1 - \alpha$.

This is Theorem 3 of Reference 2, and there is a short proof of the result there. Of course, if $(X, Y)$ is not jointly normal (which is almost always the case), then Fieller’s confidence sets are no longer exact but approximate.

Fieller’s theorem is also valid in the more general setting where we are given two independent samples $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ (rather than paired samples), and use unbiased estimators for the means and independent unbiased estimators for the covariances. Reference 2 notes that the degrees of freedom for the $t$ distribution needs to be chosen appropriately in this case.

I like the way Reference 2 explicitly lays out the 3 possibilities for the Fieller confidence set:

\begin{aligned} R_\text{Fieller} = \begin{cases} (-\infty, \infty) &\text{if } q_\text{complete}^2 \leq q^2, \\ (-\infty, \min(\ell_1, \ell_2)] \cup [ \max(\ell_1, \ell_2), \infty) &\text{if } q_\text{exclusive}^2 < q^2 < q_\text{complete}^2, \\ [\min(\ell_1, \ell_2), \max(\ell_1, \ell_2) ] &\text{otherwise.} \end{cases} \end{aligned}

The first case corresponds to the setting where both $\mathbb{E}[X]$ and $\mathbb{E}[Y]$ are close to 0: here, we can’t really conclude anything about the ratio. The second case corresponds to the setting where $\mathbb{E}[Y]$ is close to zero while $\mathbb{E}[X]$ is not: here the ratio is something like $C/\epsilon$ or $-C/\epsilon$ where $C$ is a big constant while $\epsilon$ is small. The final case corresponds to the setting where both quantities are not close to zero. The Wikipedia article for Fieller’s theorem describes the confidence set using a single formula. Even though all 3 cases above are contained within this formula, I found the formula a little misleading because on the surface it looks like the confidence set is always of the form $[a, b]$ for some real numbers $a$ and $b$.

References:

1. Fieller, E. C. (1932). The distribution of the index in a normal bivariate population.
2. Von Luxburg, U., and Franz, V. H. (2009). A geometric approach to confidence sets for ratios: Fieller’s theorem, generalizations and bootstrap.