Let’s say we have a finite population of $N$ individuals, and we are interested in some trait that they have. Let $X_i$ denote the value of the trait for individual $i$. We don’t get to see all these $X_i$’s: we only sample $n$ of them. With this sample of $n$ individuals, we may be interested in obtaining an estimate of the total $T = \sum_{i=1}^N X_i$ or the mean $\mu = T / N$.
Let’s add another wrinkle to our sampling scheme: we don’t know how we obtained it! (Maybe someone else gave it to us.) All we know is that the probability of individual $i$ being included in the sample was $\pi_i$. Can we still come up with reasonable estimates for $T$ and $\mu$?
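The Horvitz-Thompson estimator answers this question. Labeling the sampled individuals $1, \dots, n$, the estimator of the total is $\hat{T}_{HT} = \sum_{i=1}^n \frac{X_i}{\pi_i}$ (and, when $N$ is known, $\hat{T}_{HT} / N$ estimates the mean $\mu$). Note that the sum only goes over $n$ terms, but it is an estimate for a sum over $N$ terms. This estimator is performing inverse probability weighting: that is, we give each observation a weight which is the inverse of its probability of inclusion. The Horvitz-Thompson estimator is unbiased for $T$, provided every $\pi_i$ is positive. Horvitz and Thompson’s original paper also worked out an expression for the estimator’s variance, but it’s substantially more complicated.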
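To make the unbiasedness claim concrete, here is a minimal sketch in Python. The population values, the inclusion probabilities, and the sampling design are all assumptions made for illustration: the post does not specify a design, so the sketch uses Poisson sampling (each individual is included independently with probability $\pi_i$), which makes the sample size random. It enumerates every possible sample of a tiny population and checks that the probability-weighted average of the estimates equals the true total. The function name `horvitz_thompson_total` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def horvitz_thompson_total(values, inclusion_probs):
    """Inverse-probability-weighted estimate of the population total.

    `values` and `inclusion_probs` hold X_i and pi_i for the sampled
    individuals only.
    """
    return sum(x / p for x, p in zip(values, inclusion_probs))


# Hypothetical population of N = 4 individuals (exact arithmetic via Fraction).
X = [Fraction(3), Fraction(10), Fraction(4), Fraction(7)]
pi = [Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(1, 3)]

# Assumed design: Poisson sampling, so every subset of the population is a
# possible sample with probability prod(pi_i or 1 - pi_i).
expected = Fraction(0)
for included in product([0, 1], repeat=len(X)):
    prob = Fraction(1)
    for flag, p in zip(included, pi):
        prob *= p if flag else 1 - p
    sample_x = [x for x, flag in zip(X, included) if flag]
    sample_pi = [p for p, flag in zip(pi, included) if flag]
    expected += prob * horvitz_thompson_total(sample_x, sample_pi)

true_total = sum(X)
print(expected, true_total)
```

Each individual contributes $X_i / \pi_i$ with probability $\pi_i$, so the expectation collapses to $\sum_{i=1}^N X_i$; the enumeration above just confirms this term by term.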
One potential application of this is if $X_i = 1$ for all individuals (so $T = N$). Here, we are just trying to estimate the size of the population $N$. If we knew the inclusion probabilities (a BIG if), then we could just use the Horvitz-Thompson estimator directly: $\hat{N} = \sum_{i=1}^n \frac{1}{\pi_i}$. Usually we don’t know the $\pi_i$’s, since they depend on the knowledge of $N$ in some way! What we could do then is to get estimates $\hat{\pi}_i$ for the inclusion probabilities, then use the plug-in principle to get the estimator $\hat{N} = \sum_{i=1}^n \frac{1}{\hat{\pi}_i}$.
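The population-size case can be sketched with a small simulation. Everything here is an assumption for illustration: the population size, the inclusion probabilities, the Poisson sampling design (independent inclusion with probability $\pi_i$), and the hypothetical function name `estimate_population_size`. The simulation uses the true inclusion probabilities, so it illustrates the "BIG if" case only.

```python
import random


def estimate_population_size(inclusion_probs):
    """Estimate N: each sampled individual contributes 1 / pi_i."""
    return sum(1.0 / p for p in inclusion_probs)


random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of N = 300 with known inclusion probabilities.
N = 300
pi = [(0.2, 0.5, 0.8)[i % 3] for i in range(N)]

# Assumed design: Poisson sampling (independent inclusion with probability pi_i).
n_reps = 500
estimates = []
for _ in range(n_reps):
    sampled_pi = [p for p in pi if random.random() < p]
    estimates.append(estimate_population_size(sampled_pi))

mean_estimate = sum(estimates) / n_reps
print(f"true N = {N}, average estimate = {mean_estimate:.1f}")
```

Passing estimated probabilities $\hat{\pi}_i$ to the same function gives the plug-in estimator. In that case the unbiasedness guarantee no longer applies automatically, since it then depends on how good the estimates are.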
Credits: I learnt of this estimator through a talk Kristian Lum gave recently at the Stanford statistics seminar.