Let’s assume that we have some distribution $F$, and we want to estimate some quantity related to it (e.g. the mean of the distribution). We can write the quantity we want to estimate (the “estimand”) as a function of $F$: $\theta = T(F)$ for some functional $T$. (We use $F$ to denote both the distribution and its cumulative distribution function (CDF).)

Here is a common estimation strategy: if we can draw samples from $F$, let’s draw $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} F$. These samples determine an *empirical CDF* $\hat{F}_n$, which simply puts a weight of $1/n$ at each of the $X_i$‘s. We can then estimate $T(F)$ with $T(\hat{F}_n)$. This is known as the plug-in estimator, since we are “plugging in” the empirical CDF for the true CDF.

*What if we can’t draw samples from $F$, but can only draw samples from some other distribution $G$, i.e. $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} G$?* Estimating $T(F)$ is not totally a lost cause if we can find observation weights $w_1, \dots, w_n$ that sum up to 1 such that the implied empirical CDF is close to $F$ or $\hat{F}_n$. By implied empirical CDF, I mean the distribution putting weight $w_i$ at $X_i$ for $i = 1, \dots, n$. If we denote the implied empirical CDF by $\hat{F}_w$, then $T(\hat{F}_w)$ would be a reasonable estimator for $T(F)$.
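As a quick sketch of the idea, the implied empirical CDF is easy to evaluate numerically: at any point $t$, it is just the sum of the weights of the samples at or below $t$. (The sample values and weights below are made up for illustration.)

```python
import numpy as np

def ecdf_weighted(x, w, t):
    """Implied empirical CDF at t: sum of weights w_i over samples with x_i <= t."""
    return np.sum(w[x <= t])

# Hypothetical samples and observation weights (weights sum to 1)
x = np.array([3.0, 1.0, 2.0])
w = np.array([0.2, 0.5, 0.3])
```

With uniform weights $w_i = 1/n$, this reduces to the usual empirical CDF $\hat{F}_n$.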

The discussion above is pretty theoretical, so let’s look at its implications for a few examples.

**Example 1: Weighted estimator for the mean**

We can write the mean as $T(F) = \int x \, dF(x)$. The plug-in estimator is

$$T(\hat{F}_n) = \int x \, d\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^n X_i,$$

which is simply the sample mean. With observation weights, the estimator becomes

$$T(\hat{F}_w) = \int x \, d\hat{F}_w(x) = \sum_{i=1}^n w_i X_i,$$

which we recognize as the weighted sample mean.
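A minimal sketch of the weighted sample mean in Python (the data and uniform weights here are made up for illustration; with uniform weights it coincides with the plain sample mean):

```python
import numpy as np

# Hypothetical sample of size 100 with uniform weights w_i = 1/n
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, size=100)
w = np.full(100, 1 / 100)  # observation weights summing to 1

# Weighted sample mean: sum_i w_i * X_i
weighted_mean = np.sum(w * x)
```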

**Example 2: Weighted estimator for the variance**

We can write the variance as $T(F) = \int x^2 \, dF(x) - \left( \int x \, dF(x) \right)^2$. It follows that the plug-in estimator is

$$T(\hat{F}_n) = \frac{1}{n} \sum_{i=1}^n X_i^2 - \left( \frac{1}{n} \sum_{i=1}^n X_i \right)^2,$$

and the weighted estimator is

$$T(\hat{F}_w) = \sum_{i=1}^n w_i X_i^2 - \left( \sum_{i=1}^n w_i X_i \right)^2.$$
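A quick sketch of the weighted plug-in variance (toy data, uniform weights for illustration; note the plug-in estimator is the biased, divide-by-$n$ variance, not the usual unbiased one):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.full(4, 0.25)  # uniform weights summing to 1

# Weighted plug-in variance: sum_i w_i X_i^2 - (sum_i w_i X_i)^2
weighted_var = np.sum(w * x**2) - np.sum(w * x) ** 2
```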

**Example 3: Weighted least squares**

In this setting, the distribution $F$ is the joint distribution of the covariates $x \in \mathbb{R}^p$ and the response variable $y \in \mathbb{R}$. The (population) regression coefficient that we want to estimate is

$$\beta = \left( \mathbb{E}[x x^\top] \right)^{-1} \mathbb{E}[x y].$$

If we draw samples $(x_1, y_1), \dots, (x_n, y_n)$ with the $x_i$‘s thought of as column vectors, and if we let $X \in \mathbb{R}^{n \times p}$ be the matrix with rows being the $x_i^\top$‘s and $y \in \mathbb{R}^n$ being the column vector of the $y_i$‘s, then the plug-in estimator is

$$\hat{\beta} = \left( \frac{1}{n} \sum_{i=1}^n x_i x_i^\top \right)^{-1} \left( \frac{1}{n} \sum_{i=1}^n x_i y_i \right) = (X^\top X)^{-1} X^\top y,$$

which you might recognize as the usual ordinary least squares (OLS) estimator. If we have observation weights $w_1, \dots, w_n$, then we have the weighted least squares estimator

$$\hat{\beta}_w = \left( \sum_{i=1}^n w_i x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^n w_i x_i y_i \right) = (X^\top W X)^{-1} X^\top W y,$$

where $W = \operatorname{diag}(w_1, \dots, w_n)$ is the diagonal matrix with diagonal entries $w_1, \dots, w_n$. (This is the same formula as the one I presented in this previous post on weighted least squares.)
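A sketch of the weighted least squares estimator in Python, on simulated data (the design, true coefficients, and weights below are all made-up assumptions for illustration):

```python
import numpy as np

# Simulate a regression problem: y = X beta + noise
rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

# Hypothetical observation weights, normalized to sum to 1
w = rng.uniform(0.5, 1.5, size=n)
w = w / w.sum()
W = np.diag(w)

# Weighted least squares: solve (X^T W X) beta = X^T W y
beta_w = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
```

Note that solving the linear system directly is numerically preferable to forming the matrix inverse explicitly.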

**Example 4: Weighted Atkinson index**

In this previous post, we introduced the Atkinson index as a measure of inequality for a given distribution. In that post, what we presented was actually the plug-in estimator for the Atkinson index. Assume that the inequality-aversion parameter $\varepsilon$ is not equal to 1. The Atkinson index for a distribution $F$ is defined as

$$A_\varepsilon(F) = 1 - \frac{\left( \int x^{1-\varepsilon} \, dF(x) \right)^{1/(1-\varepsilon)}}{\int x \, dF(x)}.$$

If we replace $F$ with $\hat{F}_n$, we get

$$A_\varepsilon(\hat{F}_n) = 1 - \frac{\left( \frac{1}{n} \sum_{i=1}^n X_i^{1-\varepsilon} \right)^{1/(1-\varepsilon)}}{\frac{1}{n} \sum_{i=1}^n X_i},$$

which is the formula I presented in the previous post. We can get a weighted Atkinson index by replacing $\hat{F}_n$ with $\hat{F}_w$:

$$A_\varepsilon(\hat{F}_w) = 1 - \frac{\left( \sum_{i=1}^n w_i X_i^{1-\varepsilon} \right)^{1/(1-\varepsilon)}}{\sum_{i=1}^n w_i X_i}.$$

(As far as I can tell, this formula hasn’t appeared anywhere before, and none of the functions in R which compute the weighted Atkinson index use this formula.)
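The weighted plug-in Atkinson index is straightforward to sketch in Python. The function name, toy incomes, and uniform weights below are all illustrative assumptions (with uniform weights it reduces to the unweighted plug-in index):

```python
import numpy as np

def atkinson_weighted(x, w, eps):
    """Weighted plug-in Atkinson index, assuming eps != 1 and positive incomes x."""
    # Equally-distributed-equivalent income: (sum_i w_i x_i^(1-eps))^(1/(1-eps))
    ede = np.sum(w * x ** (1 - eps)) ** (1 / (1 - eps))
    # Index: 1 - EDE / weighted mean income
    return 1 - ede / np.sum(w * x)

# Hypothetical incomes with uniform weights
incomes = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.full(4, 0.25)
index = atkinson_weighted(incomes, weights, eps=0.5)
```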
