In the previous post, we introduced the notion of leverage in linear regression. If we have a response vector $y \in \mathbb{R}^n$ and design matrix $X \in \mathbb{R}^{n \times p}$, the hat matrix is defined as $H = X(X^T X)^{-1} X^T$, and the leverage of data point $i$ is the $i$th diagonal entry of $H$, which we denote by $h_{ii}$. It is so called because it is a measure of the influence that $y_i$ has on its own prediction $\hat{y}_i$: since $\hat{y} = Hy$, we have $\partial \hat{y}_i / \partial y_i = h_{ii}$. The higher the leverage, the more influence $y_i$ has on its own prediction.
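To make this concrete, here is a small numpy sketch (the data, dimensions, and seed are illustrative, not from the post) that builds the hat matrix and checks that perturbing $y_i$ moves $\hat{y}_i$ by exactly $h_{ii}$ times the perturbation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3

# Illustrative data: design matrix with an intercept column, random response.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)

# Hat matrix H = X (X^T X)^{-1} X^T; its diagonal holds the leverages.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverages = np.diag(H)

# Since yhat = H y, bumping y_i by delta changes yhat_i by h_ii * delta.
i, delta = 0, 1.0
yhat = H @ y
y_perturbed = y.copy()
y_perturbed[i] += delta
yhat_perturbed = H @ y_perturbed
print(np.isclose(yhat_perturbed[i] - yhat[i], leverages[i] * delta))
```

(In practice one would compute leverages via a QR decomposition rather than forming $H$ explicitly, but the direct formula keeps the sketch close to the definitions above.)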
It turns out that leverage must satisfy the following bounds: $0 \leq h_{ii} \leq 1$ for all $i = 1, \dots, n$.
This is easy to prove using the following 2 facts:

- Note that $H^2 = X(X^T X)^{-1} X^T X (X^T X)^{-1} X^T = X(X^T X)^{-1} X^T = H$, i.e. $H$ is idempotent.
- Note that $H^T = \left[ X(X^T X)^{-1} X^T \right]^T = X(X^T X)^{-1} X^T = H$, i.e. $H$ is symmetric.
Since $H = H^2$, using the symmetry of $H$ it follows that

$$h_{ii} = \sum_{j=1}^n h_{ij} h_{ji} = \sum_{j=1}^n h_{ij}^2 = h_{ii}^2 + \sum_{j \neq i} h_{ij}^2.$$

- Since the RHS is a sum of squared numbers, $h_{ii} \geq 0$.
- Since the second term on the RHS is a sum of squared numbers, $h_{ii} \geq h_{ii}^2$, and since $h_{ii} \geq 0$, we have $h_{ii} \leq 1$.
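As a numerical sanity check on this argument, the sketch below (using a random full-rank design matrix as illustration) verifies idempotence, symmetry, and the bounds $0 \leq h_{ii} \leq 1$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 4

# Illustrative full-rank design matrix.
X = rng.normal(size=(n, p))

# Hat matrix and its diagonal (the leverages).
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# The two facts used in the proof: H is idempotent and symmetric.
print(np.allclose(H @ H, H))
print(np.allclose(H, H.T))

# The resulting bounds hold for every data point (small tolerance
# to allow for floating-point roundoff).
print(np.all((h >= -1e-10) & (h <= 1 + 1e-10)))
```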
There is also a constraint on the sum of leverages which is easy to derive. By the cyclic property of the trace operator,

$$\sum_{i=1}^n h_{ii} = \text{tr}(H) = \text{tr}\left( X(X^T X)^{-1} X^T \right) = \text{tr}\left( (X^T X)^{-1} X^T X \right) = \text{tr}(I_p) = p.$$

That is, the leverages always sum to $p$, the number of columns of the design matrix.
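This trace identity is easy to confirm numerically as well (again with an illustrative random design matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 5

# Illustrative full-rank design matrix.
X = rng.normal(size=(n, p))
H = X @ np.linalg.inv(X.T @ X) @ X.T

# By the cyclic property of the trace:
# tr(H) = tr((X^T X)^{-1} X^T X) = tr(I_p) = p.
print(np.isclose(np.trace(H), p))
print(np.isclose(np.diag(H).sum(), p))
```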