I recently learned of a fairly succinct proof for the asymptotic distribution of the Pearson chi-square statistic (from Chapter 9 of Reference 1), which I share below.
First, the set-up: Assume that we have $n$ independent trials, and each trial ends in one of $k$ possible outcomes, which we label (without loss of generality) as $1, 2, \dots, k$. Assume that for each trial, the probability of the outcome being $j$ is $p_j$, with $p_1 + \dots + p_k = 1$. Let $N_j$ denote the number of trials that result in outcome $j$, so that $N_1 + \dots + N_k = n$. Pearson's $\chi^2$-statistic is defined as

$$\chi^2 = \sum_{j=1}^k \frac{(N_j - np_j)^2}{np_j}.$$
Theorem. As $n \to \infty$, $\chi^2 \stackrel{d}{\to} \chi^2_{k-1}$, where $\stackrel{d}{\to}$ denotes convergence in distribution.
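Before going into the proof, here is a small simulation sketch in Python (my own illustration, not from the reference): we repeatedly draw multinomial counts, compute the Pearson statistic, and compare its empirical quantiles with the $\chi^2_{k-1}$ quantiles. The particular values of $p$, $n$ and the number of replications are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Arbitrary illustration: k = 4 outcomes, n = 500 trials per replication.
p = np.array([0.1, 0.2, 0.3, 0.4])
k, n, reps = len(p), 500, 20_000

# Draw the counts (N_1, ..., N_k) for each replication.
counts = rng.multinomial(n, p, size=reps)              # shape (reps, k)

# Pearson's chi-square statistic for each replication.
chi2 = ((counts - n * p) ** 2 / (n * p)).sum(axis=1)

# Empirical quantiles should be close to the chi^2_{k-1} quantiles.
qs = [0.5, 0.9, 0.95, 0.99]
print(np.round(np.quantile(chi2, qs), 2))
print(np.round(stats.chi2.ppf(qs, df=k - 1), 2))
```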
Before proving the theorem, we prove a lemma that we will use:
Lemma. Let $X \in \mathbb{R}^k$ have distribution $\mathcal{N}(0, V)$. Then $X^T X$ has the $\chi^2_r$ distribution if and only if $V$ is idempotent (i.e. a projection) with rank $r$.

(Note: We call $A$ a projection matrix if $A$ is idempotent, i.e. $A^2 = A$.)
Proof of Lemma: Since $V$ is real and symmetric, it is orthogonally diagonalizable, i.e. there is an orthogonal matrix $P$ and a diagonal matrix $D = \text{diag}(d_1, \dots, d_k)$ such that $V = PDP^T$. Let $Y = P^T X$. Since $X \sim \mathcal{N}(0, V)$, $Y \sim \mathcal{N}(0, P^T V P) = \mathcal{N}(0, D)$, i.e. the coordinates $Y_1, \dots, Y_k$ are independent with $Y_j \sim \mathcal{N}(0, d_j)$. Furthermore, $X^T X = Y^T P^T P Y = Y^T Y$. Thus,

$$X^T X = \sum_{j=1}^k Y_j^2 = \sum_{j=1}^k d_j W_j^2,$$

where $W_1, \dots, W_k$ are i.i.d. standard normal (the $d_j$ are non-negative since $V$ is a covariance matrix). The moment generating function of this sum is $\prod_{j=1}^k (1 - 2td_j)^{-1/2}$, which equals the $\chi^2_r$ moment generating function $(1 - 2t)^{-r/2}$ if and only if exactly $r$ of the eigenvalues $d_j$ equal 1 and the rest equal 0, i.e. if and only if $V$ is idempotent with rank $r$.
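As a quick numerical check of the lemma (again just an illustration, with arbitrary choices of $k$ and $r$): if we build a symmetric idempotent matrix $V$ of rank $r$ and sample $X \sim \mathcal{N}(0, V)$, the quadratic form $X^T X$ should behave like a $\chi^2_r$ random variable.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Arbitrary rank-r orthogonal projection V = Q_r Q_r^T,
# where Q_r has r orthonormal columns.
k, r, reps = 6, 3, 20_000
Q, _ = np.linalg.qr(rng.standard_normal((k, k)))
Qr = Q[:, :r]
V = Qr @ Qr.T                       # symmetric, idempotent, rank r

# Sample X ~ N(0, V): if Z ~ N(0, I_r), then X = Q_r Z has covariance Q_r Q_r^T = V.
Z = rng.standard_normal((reps, r))
X = Z @ Qr.T

# The lemma says X^T X should follow a chi^2_r distribution.
quad = (X ** 2).sum(axis=1)
qs = [0.5, 0.9, 0.99]
print(np.round(np.quantile(quad, qs), 2))
print(np.round(stats.chi2.ppf(qs, df=r), 2))
```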
We are now ready to prove the theorem.
Proof of Theorem (asymptotic distribution of Pearson statistic): For each $j = 1, \dots, k$, let $e_j$ denote the vector in $\mathbb{R}^k$ with all zeros except for a one in the $j$th entry. Let $Z_i$ be equal to $e_j$ if the $i$th trial resulted in outcome $j$. Then $Z_1, \dots, Z_n$ are i.i.d. with the multinomial distribution:

$$\mathbb{P}(Z_i = e_j) = p_j, \qquad j = 1, \dots, k,$$

and $\mathbb{E}[Z_i] = p$, $\text{Cov}(Z_i) = \Sigma$, where

$$p = (p_1, \dots, p_k)^T, \qquad \Sigma = \text{diag}(p) - pp^T.$$
Let $\bar{Z}_n = \frac{1}{n} \sum_{i=1}^n Z_i$, so that $N_j = n \bar{Z}_{n,j}$ for each $j$. We can rewrite the Pearson $\chi^2$ statistic as

$$\chi^2 = \sum_{j=1}^k \frac{(N_j - np_j)^2}{np_j} = \sum_{j=1}^k \frac{n(\bar{Z}_{n,j} - p_j)^2}{p_j} = \left\| \sqrt{n}\, \text{diag}(p)^{-1/2} (\bar{Z}_n - p) \right\|^2.$$

By the Central Limit Theorem, $\sqrt{n}(\bar{Z}_n - p) \stackrel{d}{\to} W$, where $W \sim \mathcal{N}(0, \Sigma)$. Applying the Continuous Mapping Theorem,

$$\chi^2 \stackrel{d}{\to} \left\| \text{diag}(p)^{-1/2} W \right\|^2.$$
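The two identities above (the covariance of the $Z_i$ and the rewriting of the statistic as a squared norm) are easy to sanity-check numerically; here is a short sketch with an arbitrary choice of $p$ and $n$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary illustration: k = 4 outcomes, n = 1000 trials.
p = np.array([0.1, 0.2, 0.3, 0.4])
k, n = len(p), 1000

# Z_i is the indicator (one-hot) vector of the i-th trial's outcome.
outcomes = rng.choice(k, size=n, p=p)
Z = np.eye(k)[outcomes]                      # shape (n, k)
Zbar = Z.mean(axis=0)                        # \bar{Z}_n
N = Z.sum(axis=0)                            # counts N_1, ..., N_k

# The sample covariance of the Z_i is close to Sigma = diag(p) - p p^T.
print(np.round(np.cov(Z, rowvar=False, bias=True), 3))
print(np.round(np.diag(p) - np.outer(p, p), 3))

# The two expressions for the Pearson statistic agree exactly.
classic = ((N - n * p) ** 2 / (n * p)).sum()
squared_norm = np.sum(n * (Zbar - p) ** 2 / p)
print(classic, squared_norm)
```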
If we define $X = \text{diag}(p)^{-1/2} W$, then $X \sim \mathcal{N}\left(0, \text{diag}(p)^{-1/2}\, \Sigma\, \text{diag}(p)^{-1/2}\right)$ and $\chi^2 \stackrel{d}{\to} X^T X$. By the lemma, it remains to show that $\text{diag}(p)^{-1/2}\, \Sigma\, \text{diag}(p)^{-1/2}$ is a projection matrix of rank $k - 1$. We can write this matrix as

$$\text{diag}(p)^{-1/2} \left( \text{diag}(p) - pp^T \right) \text{diag}(p)^{-1/2} = I - \sqrt{p}\, \sqrt{p}^T,$$

where $\sqrt{p} = (\sqrt{p_1}, \dots, \sqrt{p_k})^T$.
This matrix is a projection:

$$\left( I - \sqrt{p}\, \sqrt{p}^T \right)^2 = I - 2\sqrt{p}\, \sqrt{p}^T + \sqrt{p} \left( \sqrt{p}^T \sqrt{p} \right) \sqrt{p}^T = I - \sqrt{p}\, \sqrt{p}^T,$$

since $\sqrt{p}^T \sqrt{p} = \sum_{j=1}^k p_j = 1$.

We can compute the trace of the matrix:

$$\text{tr}\left( I - \sqrt{p}\, \sqrt{p}^T \right) = \text{tr}(I) - \text{tr}\left( \sqrt{p}\, \sqrt{p}^T \right) = k - \sum_{j=1}^k p_j = k - 1.$$

Since $I - \sqrt{p}\, \sqrt{p}^T$ is a symmetric projection matrix, its rank is equal to its trace (proof here), i.e. its rank is $k - 1$. This completes the proof.
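To close, a tiny numerical check of this last step (with an arbitrary $p$ again): the matrix $I - \sqrt{p}\,\sqrt{p}^T$ is symmetric, idempotent, and has trace and rank $k - 1$.

```python
import numpy as np

# Arbitrary example probability vector (k = 4).
p = np.array([0.1, 0.2, 0.3, 0.4])
k = len(p)
sqrt_p = np.sqrt(p)

M = np.eye(k) - np.outer(sqrt_p, sqrt_p)

# M equals diag(p)^{-1/2} (diag(p) - p p^T) diag(p)^{-1/2}.
D_inv_half = np.diag(1 / sqrt_p)
Sigma = np.diag(p) - np.outer(p, p)
print(np.allclose(D_inv_half @ Sigma @ D_inv_half, M))   # True

print(np.allclose(M, M.T))                  # symmetric: True
print(np.allclose(M @ M, M))                # idempotent: True
print(np.trace(M))                          # k - 1 = 3.0
print(np.linalg.matrix_rank(M))             # k - 1 = 3
```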
References:
1. Ferguson, T. S. (1996). A Course in Large Sample Theory.