In this previous post, we introduced the notion of effective sample size. In the context of estimating a mean, if we draw $x_1, \dots, x_n$ in an independent and identically distributed (i.i.d.) manner from a distribution with variance $\sigma^2$, the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ has variance $\sigma^2 / n$. When there is correlation between the $x_i$'s, the effective sample size $n_{\text{eff}}$ is defined by the equation

$$\text{Var}(\bar{x}) = \frac{\sigma^2}{n_{\text{eff}}}. \tag{1}$$
In the Markov Chain Monte Carlo (MCMC) setting, the effective sample size is defined as

$$n_{\text{eff}} = \frac{n}{1 + 2\sum_{k=1}^{\infty} \rho_k}, \tag{2}$$

where $\rho_k$ is the autocorrelation of the sequence at lag $k$.
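As a concrete illustration of (2), suppose the autocorrelations decay geometrically, $\rho_k = \phi^k$ for some $0 < \phi < 1$ (as they do for a stationary AR(1) chain; the geometric decay is just an assumption for this example). Then $\sum_{k=1}^{\infty} \rho_k = \frac{\phi}{1 - \phi}$, and (2) becomes

$$n_{\text{eff}} = \frac{n}{1 + \frac{2\phi}{1 - \phi}} = n \cdot \frac{1 - \phi}{1 + \phi},$$

so with $\phi = 0.5$, for example, a chain of length $n$ is worth about $n/3$ independent draws.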
How are these two formulas related? I couldn’t find a reference that explained why (2) is the correct formula for effective sample size in the MCMC context; any pointers would be great. (This blog post provides some context, but no derivation.)
Here’s my line of thought for how we can derive (2) from (1). From (1), $n_{\text{eff}} = \sigma^2 / \text{Var}(\bar{x})$. Plugging in the variance of the sample mean of a stationary correlated sequence,

$$\text{Var}(\bar{x}) = \frac{\sigma^2}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho_k\right],$$

gives

$$n_{\text{eff}} = \frac{n}{1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho_k}.$$

If we replace the sum’s final index $n - 1$ with $\infty$ and replace all the $\left(1 - \frac{k}{n}\right)$ terms with 1, we get the formula in (2).
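Here is a small numerical sketch of how much those two replacements actually change the answer. It assumes, purely for illustration, the geometrically decaying autocorrelations $\rho_k = \phi^k$ from the AR(1) example above; the particular values of $\phi$ and $n$ are arbitrary.

```python
import numpy as np

# Illustration only: assume geometrically decaying autocorrelations,
# rho_k = phi**k (as for a stationary AR(1) chain). phi and n are arbitrary.
phi, n = 0.9, 1_000

k = np.arange(1, n)                      # lags 1, ..., n-1
rho = phi ** k

# Denominator of the exact finite-n expression derived from (1):
#   1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho_k
exact_denom = 1 + 2 * np.sum((1 - k / n) * rho)

# Denominator of formula (2): 1 + 2 * sum_{k=1}^{infinity} rho_k,
# which for rho_k = phi**k equals 1 + 2 * phi / (1 - phi).
approx_denom = 1 + 2 * phi / (1 - phi)

print(n / exact_denom)    # ~53.1: n_eff from the exact finite-n expression
print(n / approx_denom)   # ~52.6: n_eff from formula (2), slightly smaller
```

For these values the two versions differ by about 1%, with formula (2) giving the slightly smaller (more conservative) value.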
I think these two approximations can be pretty good in the MCMC context for a few reasons:
- We usually expect autocorrelations at very large lags to be close to zero. Thus, adding $\rho_k$ terms for $k \geq n$ probably doesn’t make much of a difference to the denominator, especially when $n$ is pretty large.
- For many MCMC settings, the autocorrelations are usually all positive, so including more $\rho_k$ terms creates a downward bias, if any. If we had to have any bias, this is probably in the right direction, as it gives us a more conservative estimate of the effective sample size.
- Replacing $\left(1 - \frac{k}{n}\right)$ by 1 creates a downward bias (the right direction), if any. Also, this approximation is very good for large values of $n$ and small values of $k$. While the approximation is bad for large values of $k$, $\rho_k$ is likely to be very close to zero in this case, so replacing $\left(1 - \frac{k}{n}\right)$ by 1 is no big deal.
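To close the loop, here is a rough simulation sketch that checks definition (1) against formula (2) directly, again assuming an AR(1) chain purely for illustration: it simulates many independent replications of the chain, estimates $\text{Var}(\bar{x})$ across replications, backs $n_{\text{eff}}$ out of (1), and compares it with the closed-form value $n(1 - \phi)/(1 + \phi)$ from (2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustration only: a stationary AR(1) chain x_t = phi * x_{t-1} + eps_t with
# standard normal innovations, so rho_k = phi**k. phi, n, n_reps are arbitrary.
phi, n, n_reps = 0.9, 1_000, 5_000
sigma2 = 1.0 / (1 - phi**2)          # stationary variance of the chain

# Simulate n_reps independent chains of length n, started at stationarity,
# keeping only a running total so we can form each chain's sample mean.
x = rng.standard_normal(n_reps) * np.sqrt(sigma2)
total = x.copy()
for _ in range(n - 1):
    x = phi * x + rng.standard_normal(n_reps)
    total += x

var_xbar = np.var(total / n)         # Monte Carlo estimate of Var(x_bar)

print(sigma2 / var_xbar)             # n_eff from definition (1): roughly 53
print(n * (1 - phi) / (1 + phi))     # n_eff from formula (2): ~52.6
```

Up to Monte Carlo error, the two numbers agree, with formula (2) again erring slightly on the conservative side.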