Bounds/constraints on leverage in linear regression

In the previous post, we introduced the notion of leverage in linear regression. If we have a response vector y \in \mathbb{R}^n and a design matrix X \in \mathbb{R}^{n \times p} with full column rank (so that X^TX is invertible), the hat matrix is defined as H = X(X^TX)^{-1}X^T, and the leverage of data point i is the ith diagonal entry of H, which we denote by H_{ii}. It is so called because it measures the influence that y_i has on its own prediction \hat{y}_i: \text{Cov}(\hat{y}_i, y_i) = \sigma^2 H_{ii}. The higher the leverage, the more influence y_i has on its own prediction.
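
To make this concrete, here is a minimal numpy sketch (the design matrix and response are made up purely for illustration) that forms the hat matrix and reads off the leverages:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3
X = rng.normal(size=(n, p))   # toy design matrix with full column rank (made up for illustration)
y = rng.normal(size=n)        # toy response vector

# Hat matrix H = X (X^T X)^{-1} X^T; solve() avoids forming the inverse explicitly.
H = X @ np.linalg.solve(X.T @ X, X.T)

leverages = np.diag(H)        # H_ii, the leverage of data point i
y_hat = H @ y                 # fitted values: \hat{y} = H y
```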

It turns out that leverage must satisfy the following bounds:

0 \leq H_{ii} \leq 1 for all i = 1, \dots, n.

This is easy to prove using the following two facts:

  1. Note that H^2 = X(X^TX)^{-1}X^TX(X^TX)^{-1}X^T = X(X^TX)^{-1}X^T = H, i.e. H is idempotent.
  2. Note that H^T = [X(X^TX)^{-1}X^T]^T = X[(X^TX)^{-1}]^T X^T = X(X^TX)^{-1}X^T = H, since (X^TX)^{-1} is symmetric; i.e. H is symmetric.

Since H is idempotent, H_{ii} = (H^2)_{ii} = \displaystyle\sum_{j=1}^n H_{ij}H_{ji}; using the symmetry of H and separating out the j = i term, it follows that

H_{ii} = H_{ii}^2 + \displaystyle\sum_{j \neq i} H_{ij}^2.

  • Since the RHS is a sum of squares, H_{ii} \geq 0.
  • Since the second term on the RHS is a sum of squares, H_{ii} \geq H_{ii}^2, i.e. H_{ii}(1 - H_{ii}) \geq 0. Combined with H_{ii} \geq 0, this gives H_{ii} \leq 1.

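These bounds (and the two facts behind them) are easy to sanity-check numerically. A minimal sketch, again with a made-up full-rank design:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))             # any design matrix with full column rank
H = X @ np.linalg.solve(X.T @ X, X.T)    # hat matrix
h = np.diag(H)                           # leverages H_ii

assert np.allclose(H @ H, H)             # fact 1: H is idempotent
assert np.allclose(H, H.T)               # fact 2: H is symmetric
assert np.all((h >= 0) & (h <= 1))       # 0 <= H_ii <= 1 for every i
```
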
There is also a constraint on the sum of the leverages, which is easy to derive. By the cyclic property of the trace operator,

\begin{aligned} \sum_{i=1}^n H_{ii} &= \text{tr}(H) = \text{tr} [X(X^TX)^{-1}X^T] \\  &= \text{tr}[X^T X (X^T X)^{-1}] \\ &= \text{tr}(I_p) = p. \end{aligned}
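
This too is easy to verify numerically: the leverages of any full-column-rank design sum to the number of columns, so the average leverage is p/n. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 4
X = rng.normal(size=(n, p))              # made-up full-column-rank design
H = X @ np.linalg.solve(X.T @ X, X.T)

# Sum of leverages = tr(H) = p, so the average leverage is p / n.
assert np.isclose(np.diag(H).sum(), p)
print(np.diag(H).sum(), p / n)           # prints 4.0 (up to rounding) and 0.08
```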