I recently learned of a fairly succinct proof for the asymptotic distribution of the Pearson chi-square statistic (from Chapter 9 of Reference 1), which I share below.

First, the set-up: Assume that we have $n$ independent trials, and each trial ends in one of $J$ possible outcomes, which we label (without loss of generality) as $1, 2, \dots, J$. Assume that for each trial, the probability of the outcome being $j$ is $p_j > 0$. Let $n_j$ denote the number of trials that result in outcome $j$, so that $\sum_{j=1}^J n_j = n$. *Pearson’s $\chi^2$-statistic* is defined as

$$\chi^2 = \sum_{j=1}^J \frac{(n_j - np_j)^2}{np_j}.$$

Theorem. As $n \rightarrow \infty$, $\chi^2 \stackrel{d}{\rightarrow} \chi_{J-1}^2$, where $\stackrel{d}{\rightarrow}$ denotes convergence in distribution.
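As a quick sanity check of the theorem, here is a minimal Monte Carlo sketch in Python (standard library only; it is not from Reference 1). The function names, the probabilities `[0.2, 0.3, 0.5]`, the sample size and the number of replications are all illustrative assumptions. It simulates the statistic many times and compares the sample mean and variance with those of a chi-square distribution with $J - 1$ degrees of freedom, which are $J - 1$ and $2(J - 1)$.

```python
import random


def pearson_statistic(counts, probs):
    """Sum over outcomes of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def simulate_statistics(probs, n_trials, n_reps, seed=0):
    """Simulate n_reps multinomial experiments and return their Pearson statistics."""
    rng = random.Random(seed)
    outcomes = range(len(probs))
    stats = []
    for _ in range(n_reps):
        counts = [0] * len(probs)
        # Draw n_trials independent outcomes with the given probabilities.
        for j in rng.choices(outcomes, weights=probs, k=n_trials):
            counts[j] += 1
        stats.append(pearson_statistic(counts, probs))
    return stats


# Illustrative (hypothetical) setting: J = 3 outcomes, n = 500 trials.
probs = [0.2, 0.3, 0.5]
stats = simulate_statistics(probs, n_trials=500, n_reps=2000)

mean = sum(stats) / len(stats)
var = sum((s - mean) ** 2 for s in stats) / (len(stats) - 1)

print(f"sample mean     = {mean:.3f} (chi-square with 2 d.f. has mean 2)")
print(f"sample variance = {var:.3f} (chi-square with 2 d.f. has variance 4)")
```

Matching the first two moments is of course only a rough check, not a test of the full distribution; comparing empirical quantiles with chi-square quantiles would be a natural next step.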

Before proving the theorem, we prove a lemma that we will use:

Lemma. Let $X \in \mathbb{R}^J$ have distribution $\mathcal{N}(0, P)$. Then $X^T X$ has distribution $\chi_r^2$ if and only if $P$ is idempotent (i.e. a projection) with rank $r$.

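(*Note:* We call a matrix $P$ a **projection matrix** if $P$ is idempotent, i.e. $P^2 = P$.)

*Proof of Lemma:* Since $P$ is real and symmetric, it is orthogonally diagonalizable, i.e. there is an orthogonal matrix $Q$ and a diagonal matrix $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_J)$ such that $P = Q \Lambda Q^T$. Let $Y = Q^T X$. Since $X \sim \mathcal{N}(0, P)$, $Y \sim \mathcal{N}(0, Q^T P Q) = \mathcal{N}(0, \Lambda)$. Furthermore, $Y^T Y = X^T Q Q^T X = X^T X$. Thus,

$$\begin{aligned}
X^T X \sim \chi_r^2 \quad &\Leftrightarrow \quad Y^T Y \sim \chi_r^2 \\
&\Leftrightarrow \quad \sum_{j=1}^J \lambda_j Z_j^2 \sim \chi_r^2, \text{ where } Z_1, \dots, Z_J \text{ are i.i.d. } \mathcal{N}(0, 1) \\
&\Leftrightarrow \quad \text{exactly } r \text{ of the } \lambda_j \text{ equal } 1 \text{ and the rest equal } 0 \\
&\Leftrightarrow \quad P \text{ is idempotent with rank } r.
\end{aligned}$$

The third equivalence follows by comparing moment generating functions: $\prod_{j=1}^J (1 - 2\lambda_j t)^{-1/2} = (1 - 2t)^{-r/2}$ for all $t$ near zero if and only if exactly $r$ of the $\lambda_j$ equal $1$ and the rest equal $0$.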

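To illustrate the "if" direction of the lemma numerically, here is a small Python sketch (standard library only). The choice of projection, $P = I - uu^T$ with $u = (1/2, 1/2, 1/2, 1/2)^T$, and the helper names are assumptions made purely for illustration. Since $P$ is symmetric and idempotent, $X = PZ$ with $Z \sim \mathcal{N}(0, I)$ has covariance $PP^T = P$, so $X^T X$ should behave like a chi-square variable with $\text{rank}(P) = 3$ degrees of freedom.

```python
import random


def projection_off_unit_vector(u):
    """Return I - u u^T, which is a symmetric projection when u has unit length."""
    J = len(u)
    return [[(1.0 if a == b else 0.0) - u[a] * u[b] for b in range(J)]
            for a in range(J)]


def sample_squared_norms(P, n_reps, seed=0):
    """Draw X = P Z with Z standard normal and return the values of X^T X."""
    rng = random.Random(seed)
    J = len(P)
    values = []
    for _ in range(n_reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(J)]
        x = [sum(P[a][b] * z[b] for b in range(J)) for a in range(J)]
        values.append(sum(v * v for v in x))
    return values


# Hypothetical example: u = (1/2, 1/2, 1/2, 1/2) has unit length, so rank(P) = 3.
u = [0.5, 0.5, 0.5, 0.5]
P = projection_off_unit_vector(u)
norms = sample_squared_norms(P, n_reps=20000)

norm_mean = sum(norms) / len(norms)
norm_var = sum((v - norm_mean) ** 2 for v in norms) / (len(norms) - 1)

print(f"sample mean     = {norm_mean:.3f} (chi-square with 3 d.f. has mean 3)")
print(f"sample variance = {norm_var:.3f} (chi-square with 3 d.f. has variance 6)")
```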
We are now ready to prove the theorem.

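*Proof of Theorem (asymptotic distribution of Pearson statistic):* For each $j$, let $e_j$ denote the vector in $\mathbb{R}^J$ with all zeros except for a one in the $j$th entry. Let $X_i$ be equal to $e_j$ if the $i$th trial resulted in outcome $j$. Then $X_1, \dots, X_n$ are i.i.d. with the multinomial distribution: $\mathbb{E}[X_i] = p$ and $\text{Cov}(X_i) = \Sigma$, where

$$p = (p_1, \dots, p_J)^T, \qquad \Sigma = \operatorname{diag}(p_1, \dots, p_J) - pp^T.$$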

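Let $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$, so that the $j$th entry of $\bar{X}_n$ is $n_j / n$. We can rewrite the Pearson statistic as

$$\chi^2 = n \sum_{j=1}^J \frac{(n_j / n - p_j)^2}{p_j} = n (\bar{X}_n - p)^T D^{-1} (\bar{X}_n - p),$$

where $D = \operatorname{diag}(p_1, \dots, p_J)$, so that $\Sigma = D - pp^T$.

By the Central Limit Theorem, $\sqrt{n} (\bar{X}_n - p) \stackrel{d}{\rightarrow} Y$, where $Y \sim \mathcal{N}(0, \Sigma)$. Applying the Continuous Mapping Theorem,

$$\chi^2 = \left[ \sqrt{n} (\bar{X}_n - p) \right]^T D^{-1} \left[ \sqrt{n} (\bar{X}_n - p) \right] \stackrel{d}{\rightarrow} Y^T D^{-1} Y.$$

If we define $Z = D^{-1/2} Y$, then $Z \sim \mathcal{N}(0, D^{-1/2} \Sigma D^{-1/2})$ and $Y^T D^{-1} Y = Z^T Z$. By the lemma, it remains to show that $D^{-1/2} \Sigma D^{-1/2}$ is a projection matrix of rank $J - 1$. We can write this matrix as

$$D^{-1/2} \Sigma D^{-1/2} = D^{-1/2} (D - pp^T) D^{-1/2} = I - D^{-1/2} pp^T D^{-1/2}.$$

This matrix is a projection: since $p^T D^{-1} p = \sum_{j=1}^J p_j = 1$,

$$\begin{aligned}
(I - D^{-1/2} pp^T D^{-1/2})^2 &= I - 2 D^{-1/2} pp^T D^{-1/2} + D^{-1/2} p (p^T D^{-1} p) p^T D^{-1/2} \\
&= I - D^{-1/2} pp^T D^{-1/2}.
\end{aligned}$$

We can compute the trace of the matrix:

$$\text{tr}(I - D^{-1/2} pp^T D^{-1/2}) = J - \text{tr}(p^T D^{-1} p) = J - 1.$$

Since $D^{-1/2} \Sigma D^{-1/2}$ is a projection matrix and can be shown to be symmetric, its rank is equal to its trace, i.e. its rank is $J - 1$. This completes the proof.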
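The projection and trace computations in the proof are easy to check numerically. Below is a minimal Python sketch (standard library only) under my own labelling: it takes the covariance to be $\Sigma = \operatorname{diag}(p) - pp^T$, forms the matrix $M$ with entries $\Sigma_{ab} / \sqrt{p_a p_b}$ (i.e. $\Sigma$ rescaled on both sides by the inverse square roots of the probabilities), and checks that $M$ is symmetric, that $M^2 = M$, and that its trace is $J - 1$. The probability vector and helper names are hypothetical.

```python
import math


def scaled_covariance(p):
    """Return M with entries Sigma[a][b] / sqrt(p[a] * p[b]), where Sigma = diag(p) - p p^T."""
    J = len(p)
    sigma = [[(p[a] if a == b else 0.0) - p[a] * p[b] for b in range(J)]
             for a in range(J)]
    return [[sigma[a][b] / math.sqrt(p[a] * p[b]) for b in range(J)]
            for a in range(J)]


def matmul(A, B):
    """Plain-Python product of two square matrices stored as lists of rows."""
    J = len(A)
    return [[sum(A[a][c] * B[c][b] for c in range(J)) for b in range(J)]
            for a in range(J)]


# Hypothetical probability vector with J = 4 outcomes.
p = [0.1, 0.2, 0.3, 0.4]
J = len(p)
M = scaled_covariance(p)
M_squared = matmul(M, M)

idempotency_gap = max(abs(M_squared[a][b] - M[a][b])
                      for a in range(J) for b in range(J))
symmetry_gap = max(abs(M[a][b] - M[b][a]) for a in range(J) for b in range(J))
trace = sum(M[a][a] for a in range(J))

print(f"max |M^2 - M| entry = {idempotency_gap:.2e}")
print(f"max |M - M^T| entry = {symmetry_gap:.2e}")
print(f"trace = {trace:.6f}, J - 1 = {J - 1}")
```

Both gaps should be zero up to floating-point rounding, and the trace should equal $J - 1$ up to rounding as well.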

References:

1. Ferguson, T. S. (1996). *A Course in Large Sample Theory*.
