# General chi-square tests

In a previous post, I wrote about the asymptotic distribution of the Pearson $\chi^2$ statistic. Did you know that the Pearson $\chi^2$ statistic (and the related hypothesis test) is actually a special case of a general class of $\chi^2$ tests? In this post we describe the general $\chi^2$ test. The presentation follows that in Chapters 23 and 24 of Ferguson (1996) (Reference 1). I’m leaving out the proofs, which can be found in the reference.

(Warning: This post is going to be pretty abstract! Nevertheless, I think it’s worth a post since I don’t think the idea is well-known.)

Let’s define some quantities. Let $Z_1, Z_2, \dots \in \mathbb{R}^d$ be a sequence of random vectors whose distribution depends on a $k$-dimensional parameter $\theta$ which lies in a parameter space $\Theta$. $\Theta$ is assumed to be a non-empty open subset of $\mathbb{R}^k$, where $k \leq d$. Next, assume that the $Z_n$ are asymptotically normal, i.e. there exist $A(\theta) \in \mathbb{R}^d$ and covariance matrix $C(\theta) \in \mathbb{R}^{d \times d}$ such that \begin{aligned} \sqrt{n} \left( Z_n - A(\theta) \right) \stackrel{d}{\rightarrow} \mathcal{N} \left( 0, C(\theta)\right). \end{aligned}

Next, let $M(\theta) \in \mathbb{R}^{d \times d}$ be some covariance matrix, and define the quadratic form \begin{aligned} Q_n(\theta) = n \left( Z_n - A(\theta) \right)^\top M(\theta) \left( Z_n - A(\theta) \right). \end{aligned}

We make 3 assumptions about $A(\theta)$ and $M(\theta)$:

• $A(\theta)$ is bicontinuous, i.e. $\theta_n \rightarrow \theta \;\Leftrightarrow \; A(\theta_n) \rightarrow A(\theta)$.
• $A(\theta)$ has a continuous first partial derivative, $\dot{A}(\theta)$, of full rank $k$.
• $M(\theta)$ is continuous in $\theta$ and is uniformly bounded below, in the sense that there is a constant $\alpha > 0$ such that $M(\theta) - \alpha I$ is positive semi-definite for all $\theta \in \Theta$.

Definition. A minimum $\chi^2$ estimate is a value of $\theta$, depending on $Z_n$, that minimizes $Q_n(\theta)$. A sequence $\theta_n^* (Z_n)$ is a minimum $\chi^2$ sequence if \begin{aligned} Q_n(\theta_n^*) - \inf_{\theta \in \Theta} Q_n(\theta) \stackrel{P}{\rightarrow} 0, \end{aligned}

whatever the true value of $\theta \in \Theta$. $Q_n(\theta_n^*)$ is going to be the statistic in our hypothesis test.
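In general one finds a minimum $\chi^2$ estimate by numerical optimization, but when $A(\theta) = J\theta$ is linear in $\theta$, $Q_n$ is a quadratic in $\theta$ and the minimizer has the familiar generalized-least-squares closed form. A small sketch under that linearity assumption (the function name is mine):

```python
import numpy as np

def min_chisq_linear(Z_n, J, M):
    """Minimum chi-square estimate when A(theta) = J @ theta is linear.

    Minimizing Q_n(theta) = n (Z_n - J theta)^T M (Z_n - J theta)
    gives the GLS solution theta* = (J^T M J)^{-1} J^T M Z_n.
    """
    JtM = J.T @ M
    return np.linalg.solve(JtM @ J, JtM @ Z_n)
```

For a nonlinear $A(\theta)$ one would instead minimize $Q_n$ with a generic optimizer over $\Theta$.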

Let $\theta_0$ denote the true value of the parameter, and let $\dot{A}$, $M$ and $C$ denote $\dot{A}(\theta_0)$, $M(\theta_0)$ and $C(\theta_0)$ respectively. The first theorem states that minimum $\chi^2$ sequences are asymptotically normal with a specific mean and covariance:

Theorem 1. For any minimum $\chi^2$ sequence $\left\{ \theta_n^* \right\}$, $\sqrt{n} \left( \theta_n^* - \theta_0 \right) \stackrel{d}{\rightarrow} \mathcal{N}(0, \Sigma)$, where \begin{aligned} \Sigma = \left( \dot{A}^\top M \dot{A} \right)^{-1} \dot{A}^\top MCM\dot{A} \left( \dot{A}^\top M \dot{A} \right)^{-1}. \end{aligned}

The theorem above holds for any covariance matrix $M$ (that satisfies the assumption we make on it). We are interested in finding the matrix $M$ that makes the asymptotic covariance matrix above smallest, in the positive semi-definite ordering. The corollary below tells us what this $M$ is:

Corollary. If there is a non-singular $M_0 \in\mathbb{R}^{d \times d}$ such that $CM_0 \dot{A} = \dot{A}$, then $\Sigma (M_0) = \left( \dot{A}^\top M_0 \dot{A} \right)^{-1}$. Moreover, $\Sigma(M_0) \leq \Sigma(M)$ for all $M$.

It can be shown that we can take $M_0$ to be any generalized inverse of $C$. If $C$ is non-singular, then we can take $M_0 = C^{-1}$. For this choice of $M_0$, the next theorem gives the asymptotic distribution of $Q_n(\theta_n^*)$:
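We can sanity-check the collapse of the sandwich covariance numerically. In the sketch below (dimensions and random matrices are my own illustration), taking $M_0 = C^{-1}$ makes the middle factor $\dot{A}^\top M_0 C M_0 \dot{A}$ equal $\dot{A}^\top M_0 \dot{A}$, so the sandwich reduces to $(\dot{A}^\top M_0 \dot{A})^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 3, 2                          # hypothetical dimensions for illustration
B = rng.standard_normal((d, d))
C = B @ B.T + np.eye(d)              # a non-singular covariance matrix
A_dot = rng.standard_normal((d, k))  # full-rank d x k Jacobian of A(theta)
M0 = np.linalg.inv(C)                # non-singular, satisfies C M0 A_dot = A_dot

def sigma(M):
    """Sandwich covariance from Theorem 1 for weight matrix M."""
    G = np.linalg.inv(A_dot.T @ M @ A_dot)
    return G @ (A_dot.T @ M @ C @ M @ A_dot) @ G

sigma_m0 = sigma(M0)                              # sandwich with M = C^{-1}
collapsed = np.linalg.inv(A_dot.T @ M0 @ A_dot)   # (A_dot^T M0 A_dot)^{-1}
```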

Theorem 2. $Q_n(\theta_n^*) \stackrel{d}{\rightarrow} \chi_{v - k}^2$, where $v$ is the rank of $C(\theta_0)$.

## Application to Pearson’s $\chi^2$

Whew! That was a lot. Let’s see how Pearson’s $\chi^2$ is a special case of this. (You might want to have the previous post open for reference.) For Pearson’s $\chi^2$,

• $d = J$, the number of possible outcomes for each trial.
• $Z_n = \overline{X}_n$, the vector of relative cell frequencies.
• $A(\theta) = p(\theta)$, the vector of cell probabilities, written as a function of some $k$-dimensional parameter $\theta$, with $k \leq J-1$.
• $C(\theta) = P(\theta) - p(\theta) p(\theta)^\top$, where $P(\theta)$ is the diagonal matrix with the entries of $p(\theta)$ on its diagonal.
• To make the expression $Q_n(\theta)$ equal to Pearson’s $\chi^2$ statistic, we have to take $M(\theta) = P(\theta)^{-1}$.
• It can be shown that $P(\theta)^{-1}$ is a generalized inverse of $C(\theta)$. Hence, Theorem 2 applies.
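The identification above is easy to verify numerically: with $M(\theta) = P(\theta)^{-1}$, the quadratic form $Q_n$ is exactly the classical Pearson statistic $\sum_j (O_j - np_j)^2 / (np_j)$. A small check with made-up numbers (the cell probabilities and counts below are illustrative, not from the post):

```python
import numpy as np

p = np.array([0.2, 0.3, 0.5])      # hypothesized cell probabilities (J = 3)
counts = np.array([18, 35, 47])    # hypothetical observed counts
n = counts.sum()                   # n = 100 trials

Z_n = counts / n                   # relative cell frequencies (X-bar_n)
M = np.diag(1.0 / p)               # M(theta) = P(theta)^{-1}
Q = n * (Z_n - p) @ M @ (Z_n - p)  # the general quadratic form Q_n

# Classical Pearson statistic: sum_j (O_j - n p_j)^2 / (n p_j)
pearson = ((counts - n * p) ** 2 / (n * p)).sum()
```

The two quantities agree, confirming that Pearson’s statistic is the special case $M = P^{-1}$ of the general quadratic form.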

## Other applications of general chi-square theory

This general chi-square theory has been used in recent years to construct hypothesis tests that work under various forms of privatized data: see References 2-4 (Reference 4 just came out this week!). There are probably other applications of this theory: if you know of any, please share!

References:

1. Ferguson, T. S. (1996). A Course in Large Sample Theory.
2. Kifer, D., and Rogers, R. (2017). A New Class of Private Chi-Square Hypothesis Tests.
3. Gaboardi, M., and Rogers, R. (2018). Local Private Hypothesis Testing: Chi-Square Tests.
4. Friedberg, R., and Rogers, R. (2022). Privacy Aware Experimentation over Sensitive Groups: A General Chi Square Approach.