Horvitz–Thompson estimator

Let’s say we have a finite population of N individuals, and we are interested in some trait that they have. Let X_i denote the value of the trait for individual i. We don’t get to see all of the X_i’s: we only sample n < N of them. From this sample of n individuals, we may want to estimate the population total T = \sum_{i=1}^N X_i or the population mean \tau = \frac{1}{N}\sum_{i=1}^N X_i.

Let’s add another wrinkle to our sampling scheme: we don’t know how the sample was obtained! (Maybe someone else gave it to us.) All we know is that the probability of individual i being included in the sample was \pi_i. Can we still come up with reasonable estimates for T and \tau?

It turns out that we can. In a 1952 paper, Daniel G. Horvitz and Donovan J. Thompson introduced what is now known as the Horvitz-Thompson estimator:

\hat{T}_{HT} = \displaystyle\sum_{i=1}^n \frac{X_i}{\pi_i}.

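Note that the sum has only n terms (one for each sampled individual), yet it estimates a sum over N terms. The estimator performs inverse probability weighting: each observation gets a weight equal to the inverse of its probability of inclusion. The Horvitz-Thompson estimator is unbiased for T, as long as \pi_i > 0 for every individual in the population. To see this, let I_i be the indicator that individual i is in the sample, so that \hat{T}_{HT} = \sum_{i=1}^N I_i \frac{X_i}{\pi_i}; since \mathbb{E}[I_i] = \pi_i, taking expectations gives \mathbb{E}[\hat{T}_{HT}] = \sum_{i=1}^N X_i = T. If N is known, \hat{T}_{HT} / N is then an unbiased estimate of \tau. The paper also worked out an expression for the estimator’s variance, but it’s substantially more complicated, since it involves the joint probabilities that pairs of individuals are both included in the sample.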
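To get a feel for the unbiasedness claim, here is a small simulation sketch. Everything in it is an assumption made for illustration: the population values are made up, the function names `horvitz_thompson_total` and `poisson_sample` are hypothetical, and the sampling design (each individual included independently with probability \pi_i, so the sample size is random rather than a fixed n) is just one convenient design for which the inclusion probabilities are known exactly.

```python
import random


def horvitz_thompson_total(values, inclusion_probs):
    """Horvitz-Thompson estimate of the population total.

    values: trait values X_i of the sampled individuals
    inclusion_probs: inclusion probabilities pi_i of those same individuals
    """
    return sum(x / p for x, p in zip(values, inclusion_probs))


def poisson_sample(probs, rng):
    """Include each individual independently with probability probs[i].

    Returns the indices of the sampled individuals. (Illustrative design
    only; the post does not say how the sample was drawn.)
    """
    return [i for i, p in enumerate(probs) if rng.random() < p]


rng = random.Random(0)

# Made-up population: N = 200 trait values between 0 and 10.
N = 200
population = [rng.uniform(0, 10) for _ in range(N)]

# Individuals with larger trait values are more likely to be sampled,
# so a naive unweighted estimate would be off.
probs = [0.1 + 0.08 * x for x in population]
true_total = sum(population)

# Average the HT estimate over many repeated samples.
n_reps = 2000
estimates = []
for _ in range(n_reps):
    idx = poisson_sample(probs, rng)
    sampled_values = [population[i] for i in idx]
    sampled_probs = [probs[i] for i in idx]
    estimates.append(horvitz_thompson_total(sampled_values, sampled_probs))

mean_estimate = sum(estimates) / n_reps
print(f"true total: {true_total:.1f}")
print(f"average HT estimate: {mean_estimate:.1f}")
```

Any single estimate can be far from T, but the average over repeated samples should land close to it, which is what unbiasedness promises.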

One potential application is the case where X_i = 1 for all individuals (i = 1, \dots, N). Here T = N, so we are just trying to estimate the size of the population. If we knew the inclusion probabilities (a BIG if), then we could just use the Horvitz-Thompson estimator directly: \hat{T}_{HT} = \sum_{i=1}^n \frac{1}{\pi_i}. Usually we don’t know the \pi_i’s, since they depend on the knowledge of N in some way! What we could do then is to get estimates \hat{\pi}_i for the inclusion probabilities, then use the plug-in principle to get the estimator \sum_{i=1}^n \frac{X_i}{\hat{\pi}_i}.
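As a rough sketch of the population-size case, the estimator reduces to summing inverse inclusion probabilities over the sampled individuals. The function name `estimate_population_size` and the numbers below are hypothetical; the function does not care whether it is handed the true \pi_i or estimates \hat{\pi}_i, and how one would obtain such estimates is outside the scope of this sketch.

```python
def estimate_population_size(inclusion_probs):
    """Horvitz-Thompson estimate of N when X_i = 1 for everyone.

    inclusion_probs: the (true or estimated) inclusion probability of each
    sampled individual. Each must lie in (0, 1].
    """
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(1.0 / p for p in inclusion_probs)


# 50 observed individuals, each with inclusion probability 0.2:
# every one of them "stands in" for 5 individuals.
n_hat_equal = estimate_population_size([0.2] * 50)

# Unequal probabilities: hard-to-sample individuals get larger weights.
n_hat_mixed = estimate_population_size([0.5, 0.5, 0.25, 0.1])

print(n_hat_equal, n_hat_mixed)
```

The weights 1/\pi_i blow up as \pi_i gets small, so with plug-in estimates \hat{\pi}_i a small error in a tiny probability can move the estimate of N a lot.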

Credits: I learnt of this estimator through a talk Kristian Lum gave recently at the Stanford statistics seminar.
