Power method for obtaining the top eigenvector

The power method (also known as power iteration) is a simple algorithm for obtaining the “top” eigenvector and eigenvalue of a matrix.

Let A be an n \times n matrix with real-valued entries. Assume that:

  1. A is diagonalizable, or equivalently, A has n linearly independent eigenvectors v_1, \dots, v_n. (From here on, I assume that the v_i‘s have unit norm.)
  2. We can order the eigenvectors such that the eigenvalues associated with them (i.e. \lambda_i such that Av_i = \lambda_i v_i) satisfy the ordering |\lambda_1 | > |\lambda_2| \geq \dots \geq |\lambda_n|. What is important here is the strict inequality between |\lambda_1 | and |\lambda_2|.

The power method can be described as follows:

  1. Initialize b_0 \in \mathbb{R}^n to be some non-zero vector.
  2. For each t = 0, 1, \dots, set b_{k+1} = \dfrac{Ab_k}{\| Ab_k \|}, and set \mu_{k+1} = b_{k+1}^T A b_{k+1}.

If A satisfies the two assumptions above and b_0 has a non-zero component in the direction of v_1, then b_k converges to v_1, and \mu_k converges to \lambda_1. (This reference points out that if \lambda_1 is negative, \{ b_k \} might end up “flip-flopping” between v_1 and -v_1. One way to avoid this is to always insist on b_k having a non-negative first entry.)

The proof is pretty simple and can be found on the first page of the second reference. Section 6.2 of the third reference has some practical tips for how to use the power method in practice.

Can we say something about the convergence rate of the power method? In essence, the convergence rate depends on how close |\lambda_1| is to |\lambda_2|: the larger \left| \dfrac{\lambda_1}{\lambda_2} \right| is, the faster the power method converges. Theorem 5.6 of the last reference makes this precise:

Write our initialization b_0 as a linear combination of the eigenvectors: b_0 = \sum_{i=1}^n \alpha_i v_i. Assume that \alpha_1 \neq 0. Then there exists a constant C > 0 such that

\begin{aligned} \| b_k - v_1 \|_2 \leq C \left| \frac{\lambda_2}{\lambda_1} \right|^k \end{aligned} for all k \geq 1.

What if we are interested in eigenvalues/eigenvectors other than the top ones? We can use the power method to find \lambda_1 and v_1. It is then easy to show that the matrix

A' = A - \lambda_1 v_1 v_1^T

has eigenvalues 0, \lambda_2, \dots, \lambda_n and the same eigenvectors as A. We can then apply the power method to A' to get \lambda_2 and v_2, and so on. This known as the power method with deflation. It should be noted that the deflation method becomes more inaccurate as we compute smaller and smaller eigenvalues/eigenvectors, because the errors of approximation for the earlier eigenvalues/eigenvectors add up.

References:

  1. Wikipedia. Power iteration.
  2. The Power Method.
  3. Numerical calculation of eigenvalues.
  4. Lambers, J. (2009-10). The Eigenvalue Problem: Power Iterations.
  5. https://math.unice.fr/~frapetti/CorsoF/cours4part2.pdf
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s