Quasi-Newton methods: SR1

Introduction

In a previous post, we described how quasi-Newton methods can be used to minimize a twice-differentiable function f whose domain is all of \mathbb{R}^n. At each iteration of the method, we take the following steps:

  1. Solve for p_k in B_k p_k = - \nabla f(x_k).
  2. Update x with x_{k+1} = x_k + t_k p_k.
  3. Update B according to some procedure.
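The three steps above can be sketched as a generic loop with the B-update left as a pluggable argument. Everything concrete here (the quadratic test function, the fixed step size t, the trivial identity "update", the function and variable names) is an illustrative assumption, not part of the post:

```python
import numpy as np

def quasi_newton(f_grad, x0, update_B, t=0.1, tol=1e-8, max_iter=300):
    """Generic quasi-Newton loop; the rule for updating B is passed in."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))                 # initial Hessian approximation
    for _ in range(max_iter):
        g = f_grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = np.linalg.solve(B, -g)     # step 1: solve B_k p_k = -grad f(x_k)
        x_new = x + t * p              # step 2: x_{k+1} = x_k + t_k p_k
        B = update_B(B, x_new - x, f_grad(x_new) - g)  # step 3: B update
        x = x_new
    return x

# Toy usage: minimize f(x) = (1/2) x^T A x; keeping B = I reduces the
# loop to gradient descent with a fixed step size.
A = np.diag([1.0, 10.0])
x_star = quasi_newton(lambda x: A @ x, np.array([1.0, 1.0]),
                      update_B=lambda B, s, y: B)
```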

Many methods update B so that it satisfies the secant equation:

\begin{aligned} B_{k+1} s_k = y_k, \end{aligned}

where s_k = x_{k+1} - x_k and y_k = \nabla f(x_{k+1}) - \nabla f(x_k). This is often written as B^+ s = y for brevity. Because this is a system of n equations in the n^2 unknown entries of B_{k+1}, it is underdetermined, and there are many ways to update B.

Symmetric rank-one update (SR1)

One way to update B easily is to assume that the next B is the current B plus a symmetric rank-one matrix, i.e.

B^+ = B + auu^\top

for some a \in \mathbb{R} and u \in \mathbb{R}^n. Plugging this update into the secant equation and rearranging, we get

a (u^\top s)\, u = y - Bs,

which implies that u must be a multiple of y - Bs. Plugging this into the equation and solving for a, we obtain the update

\begin{aligned} B^+ = B + \dfrac{(y - Bs)(y - Bs)^\top}{(y - Bs)^\top s}. \end{aligned}
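As a quick sanity check of this update (with made-up s and y, which is all that matters here since B^+ s = y must hold for any pair with (y - Bs)^\top s \neq 0):

```python
import numpy as np

# Hypothetical data, not from the post; any s, y with (y - Bs)^T s != 0 works.
B = np.eye(3)
s = np.array([1.0, -1.0, 2.0])
y = np.array([3.0, 0.0, 1.0])

r = y - B @ s                          # y - Bs
B_new = B + np.outer(r, r) / (r @ s)   # SR1 update

assert np.allclose(B_new @ s, y)       # secant equation B^+ s = y holds
assert np.allclose(B_new, B_new.T)     # the update preserves symmetry
```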

Recall from the quasi-Newton steps that we need to solve the equation B p = - \nabla f(x) for p, which means that we need the inverse H = B^{-1} to get p = - H \nabla f(x). Using the Sherman-Morrison-Woodbury formula, we can turn the update to B into a direct update for H:

\begin{aligned} H^+ = H + \dfrac{(s - Hy)(s - Hy)^\top}{(s - Hy)^\top y}. \end{aligned}
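A small numerical check (with made-up s and y, assumed only for illustration) confirms that this H update agrees with inverting the SR1-updated B directly, as the Sherman-Morrison-Woodbury formula guarantees:

```python
import numpy as np

# Made-up data for a sanity check; both denominators are nonzero here.
B = np.eye(4)
H = np.linalg.inv(B)                   # = I in this toy setting
s = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 0.0, 1.0, 1.0])

r = y - B @ s
B_new = B + np.outer(r, r) / (r @ s)   # SR1 update for B

q = s - H @ y
H_new = H + np.outer(q, q) / (q @ y)   # SR1 update for H

assert np.allclose(H_new @ B_new, np.eye(4))  # H^+ is exactly (B^+)^{-1}
assert np.allclose(H_new @ y, s)              # inverse secant equation
```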

This is known as the SR1 update. SR1 is simple to understand and implement, but it has two shortcomings. First, the denominator in the update for B, (y - Bs)^\top s, might be approximately zero, causing numerical issues; in practice, implementations simply skip the update on iterations where this denominator is too small. Second, the update does not preserve positive definiteness of B.
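The loss of positive definiteness is easy to see on a toy example (the numbers below are assumed for illustration; note y^\top s < 0 here, which cannot occur for a strictly convex f but can for a nonconvex one):

```python
import numpy as np

B = np.eye(2)                          # positive definite
s = np.array([1.0, 0.0])
y = np.array([-1.0, 0.0])              # negative curvature along s

r = y - B @ s                          # [-2, 0]
B_new = B + np.outer(r, r) / (r @ s)   # denominator (y - Bs)^T s = -2

print(np.linalg.eigvalsh(B_new))       # [-1.  1.] -> B_new is indefinite
```

The secant equation B^+ s = y still holds, but the updated matrix now has a negative eigenvalue, so the search direction it produces is no longer guaranteed to be a descent direction.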

References:

  1. Peña, J. (2016). Quasi-Newton Methods. (CMU Convex Optimization: Fall 2016, Lecture on Nov 2.)
