# How to incorporate observation weights into an estimator

Let’s assume that we have some distribution $F$ and that we want to estimate some quantity related to it (e.g. the mean of the distribution). We can write the quantity we want to estimate (the “estimand”) as a function of $F$: $\theta = T(F)$ for some function $T$. (We use $F$ to denote both the distribution and its cumulative distribution function (CDF).)

Here is a common estimation strategy: if we can draw samples from $F$, let’s draw $x_1, \dots, x_n \stackrel{i.i.d.}{\sim} F$. These samples determine an empirical CDF $\hat{F}$, which simply puts a weight of $1/n$ at each of the $x_i$’s. We can then estimate $\theta$ with $\hat\theta = T(\hat{F})$. This is known as the plug-in estimator, since we are “plugging in” the empirical CDF for the true CDF.

What if we can’t draw samples from $F$, but can only draw samples from some other distribution $G$, i.e. $x_1, \dots, x_n \stackrel{i.i.d.}{\sim} G$? Estimating $\theta = T(F)$ is not a lost cause if we can find observation weights $w_1, \dots, w_n$ that sum to 1 such that the implied empirical CDF is close to $F$ or $\hat{F}$. By implied empirical CDF, I mean the distribution putting weight $w_i$ at $x_i$ for $i = 1, \dots, n$. If we denote the implied empirical CDF by $\hat{G}_w$, then $\hat{\theta}_w = T(\hat{G}_w)$ would be a reasonable estimator for $\theta$.
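To make the implied empirical CDF concrete, here is a minimal sketch in Python with NumPy (the function and variable names are my own): $\hat{G}_w(t)$ is just the total weight on the sample points that are at most $t$.

```python
import numpy as np

def weighted_ecdf(x, w, t):
    """Evaluate the implied empirical CDF G_hat_w at t:
    the sum of the weights w_i for which x_i <= t."""
    x, w = np.asarray(x), np.asarray(w)
    assert np.isclose(w.sum(), 1.0), "weights must sum to 1"
    return float(w[x <= t].sum())

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([0.1, 0.2, 0.3, 0.4])
weighted_ecdf(x, w, 2.5)  # total weight on x_1 and x_2: 0.1 + 0.2
```

With uniform weights $w_i = 1/n$, this reduces to the ordinary empirical CDF $\hat{F}$.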

The discussion above is pretty theoretical, so let’s look at its implications through a few examples.

## Example 1: Weighted estimator for the mean

We can write the mean as $\mu = \mathbb{E}_F[X] = \int x dF(x)$. The plug-in estimator is

\begin{aligned} \hat\mu = \int x d\hat{F}(x) = \sum_{i=1}^n x_i / n, \end{aligned}

which is simply the sample mean. With observation weights, the estimator becomes

\begin{aligned} \hat\mu_w = \int x d\hat{G}_w(x) = \sum_{i=1}^n w_i x_i, \end{aligned}

which we recognize as the weighted sample mean.
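As a quick sanity check, here is a sketch in Python (names are my own): the weighted sample mean $\sum_i w_i x_i$ reduces to the plain sample mean when $w_i = 1/n$.

```python
import numpy as np

def weighted_mean(x, w):
    # sum_i w_i x_i
    return float(np.dot(w, x))

x = np.array([2.0, 4.0, 6.0])
uniform_w = np.full(3, 1/3)            # recovers the plain sample mean, 4.0
skewed_w = np.array([0.5, 0.3, 0.2])   # 0.5*2 + 0.3*4 + 0.2*6 = 3.4
```

(NumPy’s built-in `np.average(x, weights=w)` computes the same quantity.)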

## Example 2: Weighted estimator for the variance

We can write the variance as $\sigma^2 = \mathbb{E}_F[X^2] - (\mathbb{E}_F[X])^2 = \int x^2 dF(x) - \left( \int x dF(x) \right)^2$. It follows that the plug-in estimator is

\begin{aligned} s^2 = \int x^2 d\hat{F}(x) - \left( \int x d\hat{F}(x) \right)^2 = \sum_{i=1}^n x_i^2/n - \left(\sum_{i=1}^n x_i / n \right)^2, \end{aligned}

and the weighted estimator is

\begin{aligned} s_w^2 = \int x^2 d\hat{G}_w(x) - \left( \int x d\hat{G}_w(x) \right)^2 = \sum_{i=1}^n w_i x_i^2 - \left(\sum_{i=1}^n w_i x_i \right)^2. \end{aligned}
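The weighted variance formula above can be sketched as follows (function names are my own). Note that, with uniform weights, this plug-in estimator matches the population-style variance (dividing by $n$, as `np.var` does by default), not the $n-1$ sample variance.

```python
import numpy as np

def weighted_variance(x, w):
    # s_w^2 = sum_i w_i x_i^2 - (sum_i w_i x_i)^2
    m = float(np.dot(w, x))
    return float(np.dot(w, np.square(x)) - m**2)

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.full(4, 0.25)   # uniform weights recover np.var(x) = 1.25
```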

## Example 3: Weighted least squares

In this setting, the distribution $F$ is the joint distribution of the covariates $X \in \mathbb{R}^p$ and the response variable $y \in \mathbb{R}$. The (population) regression coefficient that we want to estimate is

\begin{aligned} \beta &= \text{argmin}_b \; \mathbb{E}_F[(y - X^T b)^2] \\ &= \mathbb{E}_F[X X^T]^{-1} \mathbb{E}_F[X y]. \end{aligned}

If we draw samples $(X_1, y_1), \dots, (X_n, y_n) \stackrel{i.i.d.}{\sim} F$ with the $X_i$’s thought of as column vectors, and if we let $\mathbf{X} \in \mathbb{R}^{n \times p}$ be the matrix with rows being the $X_i^T$’s and $\mathbf{y} \in \mathbb{R}^n$ being the column vector of the $y_i$’s, then the plug-in estimator is

\begin{aligned} \hat\beta &= \mathbb{E}_{\hat{F}}[X_i X_i^T]^{-1} \mathbb{E}_{\hat{F}}[X_i y_i] \\ &= \left( \sum_{i=1}^n X_i X_i^T / n \right)^{-1} \left( \sum_{i=1}^n X_i y_i / n \right) \\ &= (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}, \end{aligned}

which you might recognize as the usual ordinary least squares (OLS) estimator. If we have observation weights, then we have the weighted least squares estimator

\begin{aligned} \hat\beta_w &= \mathbb{E}_{\hat{G}_w}[X_i X_i^T]^{-1} \mathbb{E}_{\hat{G}_w}[X_i y_i] \\ &= \left( \sum_{i=1}^n w_i X_i X_i^T \right)^{-1} \left( \sum_{i=1}^n w_i X_i y_i \right) \\ &= (\mathbf{X}^T \mathbf{WX})^{-1} \mathbf{X}^T\mathbf{Wy}, \end{aligned}

where $\mathbf{W}$ is the diagonal matrix with diagonal entries $w_1, \dots, w_n$. (This is the same formula as the one I presented in this previous post on weighted least squares.)
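A minimal sketch of the weighted least squares formula (variable names and the simulated data are my own), solving the normal equations $(\mathbf{X}^T \mathbf{W} \mathbf{X}) \beta = \mathbf{X}^T \mathbf{W} \mathbf{y}$ directly; with uniform weights it recovers the OLS solution.

```python
import numpy as np

def wls(X, y, w):
    # Solve (X^T W X) beta = X^T W y with W = diag(w).
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Simulated data: intercept 1, slope 2, Gaussian noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=50)

w = np.full(50, 1/50)  # uniform weights recover OLS
beta_wls = wls(X, y, w)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

(In practice you would avoid forming the $n \times n$ matrix $\mathbf{W}$ explicitly and instead scale the rows of $\mathbf{X}$ and $\mathbf{y}$ by $\sqrt{w_i}$; the closed form above is just the formula made literal.)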

## Example 4: Weighted Atkinson index

In this previous post, we introduced the Atkinson index as a measure of inequality for a given distribution. In that post, what we presented was actually the plug-in estimator for the Atkinson index. Assume that the inequality-aversion parameter $\epsilon$ is not equal to 1. The Atkinson index for a distribution $F$ is defined as

\begin{aligned} I(\epsilon) &= 1 - \left( \mathbb{E}_F [X^{1-\epsilon}] \right)^{1 / (1 - \epsilon)} \big/ \mathbb{E}_F [X]. \end{aligned}

If we replace $F$ with $\hat{F}$, we get

\begin{aligned} \hat{I}(\epsilon) &= 1 - \left( \mathbb{E}_{\hat{F}} [X^{1-\epsilon}] \right)^{1 / (1 - \epsilon)} \big/ \mathbb{E}_{\hat{F}} [X] \\ &= 1 - \left( \frac{1}{n}\sum_{i=1}^n x_i^{1-\epsilon} \right)^{1 / (1-\epsilon)} \big/ \left( \frac{1}{n} \sum_{i=1}^n x_i \right), \end{aligned}

which is the formula I presented in the previous post. We can get a weighted Atkinson index by replacing $F$ with $\hat{G}_w$:

\begin{aligned} \hat{I}_w(\epsilon) &= 1 - \left( \mathbb{E}_{\hat{G}_w} [X^{1-\epsilon}] \right)^{1 / (1 - \epsilon)} \big/ \mathbb{E}_{\hat{G}_w} [X] \\ &= 1 - \left( \sum_{i=1}^n w_i x_i^{1-\epsilon} \right)^{1 / (1-\epsilon)} \big/ \left( \sum_{i=1}^n w_i x_i \right). \end{aligned}

(As far as I can tell, this formula hasn’t appeared anywhere before, and none of the functions in R that compute the Atkinson index use this weighted formula.)
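The weighted Atkinson index above can be sketched as follows (function name is my own; this assumes $\epsilon \neq 1$ and positive values $x_i$, and that the weights sum to 1). With equal incomes the index is 0, as expected of an inequality measure.

```python
import numpy as np

def weighted_atkinson(x, w, epsilon):
    # I_w(eps) = 1 - (sum_i w_i x_i^(1-eps))^(1/(1-eps)) / (sum_i w_i x_i)
    ee = 1.0 - epsilon
    return float(1.0 - np.dot(w, np.power(x, ee)) ** (1.0 / ee) / np.dot(w, x))

equal = np.array([1.0, 1.0, 1.0, 1.0])
w = np.full(4, 0.25)
weighted_atkinson(equal, w, 0.5)  # perfectly equal incomes give 0
```

With uniform weights $w_i = 1/n$, this reduces to the unweighted formula from the previous post.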