# What do we mean by effective sample size?

Let's assume that we have some distribution $F$ and we want to estimate some quantity related to it (e.g. the mean of the distribution). A typical estimation strategy is to draw independent and identically distributed (i.i.d.) samples from $F$ (i.e. $X_1, \dots, X_n \stackrel{i.i.d.}{\sim} F$), then plug those samples into an estimator $T_n = T(X_1, \dots, X_n)$. The sample size is the number of samples we have: $n$.

But what happens if our samples are not i.i.d.? As an extreme example, imagine that we have samples $X_1, \dots, X_n$, but our sampling design forces the restriction $X_1 = \dots = X_n$. While it looks like we have $n$ numbers, intuitively we know we really only have one sample.

Effective sample size makes this idea concrete. When $X_1, \dots, X_n$ are i.i.d. from $F$, the estimator has a certain variance $\text{Var}(T_n)$. If our samples come from some other sampling design, we will have a different expression for the variance. The effective sample size is the number $n_{eff}$ such that when we compute the estimator based on $n$ samples from our sampling design, the resulting variance equals $\text{Var}(T_{n_{eff}})$: the variance we would have obtained from $n_{eff}$ i.i.d. samples.

While different estimators could give rise to different effective sample sizes, estimation of the mean via the sample mean is so common that when someone talks about effective sample size, it is almost always with respect to this estimator. In this setting, if $\text{Var}(X_i) = \sigma^2$, then

\begin{aligned} \text{Var}(T_n) = \text{Var} \left(\dfrac{X_1 + \dots + X_n}{n} \right) = \dfrac{\sigma^2}{n}. \end{aligned}

If our $X_i$'s are not drawn from the i.i.d. sampling design, then the effective sample size is the $n_{eff}$ such that

\begin{aligned}\text{Var}(T_n) = \dfrac{\sigma^2}{n_{eff}}. \end{aligned}
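We can check this definition numerically with a quick simulation (a sketch using NumPy; the standard normal distribution, the seed, and the Monte Carlo settings are arbitrary choices). We estimate $\text{Var}(T_n)$ for the i.i.d. design and for the degenerate design $X_1 = \dots = X_n$ above, then back out $n_{eff} = \sigma^2 / \text{Var}(T_n)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma = 10, 200_000, 1.0

# i.i.d. design: the sample mean of n independent draws, repeated many times
iid_means = rng.normal(0, sigma, size=(reps, n)).mean(axis=1)

# degenerate design X_1 = ... = X_n: the mean of n copies of a single draw
# is just that draw itself
dup_means = rng.normal(0, sigma, size=reps)

# invert Var(T_n) = sigma^2 / n_eff to recover the effective sample size
n_eff_iid = sigma**2 / iid_means.var()
n_eff_dup = sigma**2 / dup_means.var()
print(n_eff_iid, n_eff_dup)  # ≈ 10 and ≈ 1, matching intuition
```

The i.i.d. design recovers $n_{eff} \approx n = 10$, while the duplicated design gives $n_{eff} \approx 1$, as the extreme example suggested.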

The notion of effective sample size comes up frequently in two contexts: when observations are correlated (e.g. time series data or Markov chain Monte Carlo (MCMC) simulation), and when observations are weighted.

## Correlated observations

When observations are correlated, we have

\begin{aligned}\dfrac{\sigma^2}{n_{eff}} &= \text{Var}(T_n) \\ &= \sum_{i=1}^n \dfrac{\text{Var}(X_i)}{n^2} + 2 \sum_{1 \leq i < j \leq n} \dfrac{\text{Cov}(X_i, X_j)}{n^2} \\ &= \dfrac{\sigma^2}{n} + \dfrac{2}{n^2} \sum_{1 \leq i < j \leq n} \text{Cov}(X_i, X_j). \end{aligned}

Hence, we need expressions for the covariance to compute effective sample size.
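As a concrete illustration (my own example, not from the original post), suppose the covariances follow an AR(1)-style pattern, $\text{Cov}(X_i, X_j) = \sigma^2 \rho^{|i-j|}$. We can then evaluate the displayed formula for $\text{Var}(T_n)$ exactly and solve for $n_{eff}$:

```python
import numpy as np

n, rho, sigma2 = 100, 0.5, 1.0

# assumed covariance structure: Cov(X_i, X_j) = sigma^2 * rho^{|i-j|}
idx = np.arange(n)
cov = sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

# Var(T_n) = (1/n^2) * sum_{i,j} Cov(X_i, X_j), i.e. the diagonal terms
# (sigma^2/n) plus twice the off-diagonal covariances, as in the derivation
var_Tn = cov.sum() / n**2
n_eff = sigma2 / var_Tn
print(n_eff)  # ≈ 34, well below n = 100
```

Positive correlation inflates $\text{Var}(T_n)$, so $n_{eff}$ ends up well below $n$: here 100 correlated observations carry roughly as much information as 34 i.i.d. ones.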

## Weighted observations

In this setting, the observations are i.i.d. but each observation $X_i$ is given an observation weight $w_i \geq 0$. The weights are incorporated into the computation of the statistic (see this post for what we mean by this). These weights should affect our notion of sample size. Taking an extreme example, if $w_1 = 1$ and $w_2 = \dots = w_n = 0$, our estimator really uses only $X_1$ and none of the other samples, so the effective sample size should be 1.

In the context of estimating the mean, our weighted estimator is the weighted mean

\begin{aligned} \widetilde{T}_n &= \dfrac{\sum_{i=1}^n w_i X_i}{\sum_{i=1}^n w_i}. \end{aligned}

Since the observations are i.i.d., we have

\begin{aligned} \dfrac{\sigma^2}{n_{eff}} &= \dfrac{\sum_{i=1}^n \text{Var}(w_i X_i) }{ \left( \sum_{i=1}^n w_i \right)^2 } = \dfrac{\sigma^2 \sum_{i=1}^n w_i^2}{ \left( \sum_{i=1}^n w_i \right)^2 }, \\ n_{eff} &= \dfrac{\left( \sum_{i=1}^n w_i \right)^2}{\sum_{i=1}^n w_i^2}. \end{aligned}
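This closed-form expression (often attributed to Kish) is straightforward to compute. A minimal sketch, with `effective_sample_size` as a hypothetical helper name:

```python
import numpy as np

def effective_sample_size(w):
    """Effective sample size for weighted i.i.d. observations:
    n_eff = (sum w_i)^2 / sum w_i^2."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

print(effective_sample_size([1, 1, 1, 1]))  # equal weights: n_eff = 4.0
print(effective_sample_size([1, 0, 0, 0]))  # one nonzero weight: n_eff = 1.0
print(effective_sample_size([2, 1, 1]))     # unequal weights: 16/6 ≈ 2.67
```

Note that equal weights recover $n_{eff} = n$, a single nonzero weight recovers $n_{eff} = 1$ (the extreme example above), and any unequal weighting lands strictly in between.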
