**BFGS**

In a previous post, we described how *quasi-Newton methods* can be used to minimize a twice-differentiable function $f$ whose domain is all of $\mathbb{R}^n$.

**BFGS** is a popular quasi-Newton method. At each iteration $k$, we take the following steps:

- Solve for $\Delta x_k$ in $H_k \Delta x_k = -\nabla f(x_k)$.
- Update $x_{k+1} = x_k + t_k \Delta x_k$, with the step size $t_k$ chosen by line search.
- Update $H_{k+1}$ according to the equation

$$H_{k+1} = H_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{H_k s_k s_k^T H_k}{s_k^T H_k s_k},$$

where $s_k = x_{k+1} - x_k$ and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$.
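To make the iteration concrete, here is a minimal NumPy sketch of BFGS. This is my own illustrative code, not from the lecture: the backtracking line search, iteration budget, tolerances, and the curvature safeguard are all arbitrary choices.

```python
import numpy as np

def bfgs(f, grad_f, x0, iters=100, tol=1e-8):
    """Minimal BFGS sketch: maintains the full n x n approximation H_k."""
    x = x0.astype(float)
    H = np.eye(x.size)                       # initial Hessian approximation H_0
    g = grad_f(x)
    for _ in range(iters):
        dx = np.linalg.solve(H, -g)          # step 1: solve H_k dx = -grad f(x_k)
        t, fx, slope = 1.0, f(x), g @ dx     # step 2: backtracking (Armijo) line search
        while f(x + t * dx) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x_new = x + t * dx
        g_new = grad_f(x_new)
        s, y = x_new - x, g_new - g          # s_k and y_k
        if s @ y > 1e-12:                    # step 3: BFGS update (skip if curvature fails)
            H = H + np.outer(y, y) / (y @ s) \
                  - (H @ np.outer(s, s) @ H) / (s @ H @ s)
        x, g = x_new, g_new
        if np.linalg.norm(g) < tol:
            break
    return x

# Usage: minimize the quadratic f(x) = 0.5 x^T A x - b^T x, whose minimizer is A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = bfgs(lambda x: 0.5 * x @ A @ x - b @ x,
              lambda x: A @ x - b,
              np.zeros(2))
```

On this small quadratic, `x_star` agrees with `np.linalg.solve(A, b)` to high precision.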

**Limited memory BFGS (L-BFGS)**

The issue with BFGS is that we have to maintain the $n \times n$ matrix $H_k$, which is too costly when $n$ is large. The main idea behind *limited memory BFGS (L-BFGS)* is that we don't really need $H_k$ itself: what we really want is the product $H_k^{-1} \nabla f(x_k)$. If we can compute $H_k^{-1} \nabla f(x_k)$ implicitly by maintaining a few vectors, then we never have to deal with an $n \times n$ matrix.

Let $q$ be any vector, and write $\rho_k = 1/(y_k^T s_k)$. By the Sherman–Morrison–Woodbury formula, the BFGS update implies

$$H_{k+1}^{-1} = (I - \rho_k s_k y_k^T) H_k^{-1} (I - \rho_k y_k s_k^T) + \rho_k s_k s_k^T,$$

and hence

$$H_{k+1}^{-1} q = (I - \rho_k s_k y_k^T) H_k^{-1} (q - \rho_k y_k s_k^T q) + \rho_k s_k s_k^T q.$$

Thus, if we had an efficient way to compute $H_k^{-1} v$ for any vector $v$, we would have an efficient way to compute $H_{k+1}^{-1} q$ involving just vector operations (recall $\rho_k = 1/(y_k^T s_k)$):

- Compute $\alpha = \rho_k s_k^T q$.
- Compute $q' = q - \alpha y_k$.
- Compute $p' = H_k^{-1} q'$ efficiently.
- Compute $\beta = \rho_k y_k^T p'$.
- Compute $H_{k+1}^{-1} q = p' + (\alpha - \beta) s_k$.
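The five steps translate into a short routine. This is a sketch with hypothetical names of my own choosing, assuming we are handed a routine that applies $H_k^{-1}$ to a vector:

```python
import numpy as np

def apply_next_inverse(apply_Hk_inv, s, y, q):
    """Apply H_{k+1}^{-1} to q, given a routine computing H_k^{-1} v
    and the vectors s = s_k, y = y_k. Only vector operations are used."""
    rho = 1.0 / (y @ s)
    alpha = rho * (s @ q)                  # step 1
    q_prime = q - alpha * y                # step 2
    p_prime = apply_Hk_inv(q_prime)        # step 3: assumed efficient
    beta = rho * (y @ p_prime)             # step 4
    return p_prime + (alpha - beta) * s    # step 5: H_{k+1}^{-1} q
```

One can check this against the explicit inverse update: with $V_k = I - \rho_k y_k s_k^T$, the result equals $(V_k^T H_k^{-1} V_k + \rho_k s_k s_k^T)\, q$.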

But we do have an efficient way to compute $H_k^{-1} q'$: simply replace $k+1$ with $k$ and $q$ with $q'$ above! By recursively unfolding Step 3, we get the quasi-Newton update for iteration $k+1$ (the computation of $H_{k+1}^{-1} \nabla f(x_{k+1})$):

- $q \leftarrow \nabla f(x_{k+1})$.
- For $i = k, k-1, \dots, 0$: set $\alpha_i = \rho_i s_i^T q$ (where $\rho_i = 1/(y_i^T s_i)$), and $q \leftarrow q - \alpha_i y_i$.
- $p \leftarrow H_0^{-1} q$.
- For $i = 0, 1, \dots, k$: set $\beta_i = \rho_i y_i^T p$, and $p \leftarrow p + (\alpha_i - \beta_i) s_i$.
- Return $p$.
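The unfolded recursion can be sketched directly in code (assuming $H_0^{-1} = I$ for simplicity; the function and variable names are my own):

```python
import numpy as np

def two_loop(q, s_list, y_list):
    """Compute H_{k+1}^{-1} q via the two-loop recursion, using the stored
    pairs (s_0, y_0), ..., (s_k, y_k) and H_0^{-1} = I.
    No n x n matrix is ever formed."""
    q = q.copy()
    rho = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alpha = [0.0] * len(s_list)
    for i in reversed(range(len(s_list))):   # i = k, k-1, ..., 0
        alpha[i] = rho[i] * (s_list[i] @ q)
        q -= alpha[i] * y_list[i]
    p = q                                    # p = H_0^{-1} q with H_0 = I
    for i in range(len(s_list)):             # i = 0, 1, ..., k
        beta = rho[i] * (y_list[i] @ p)
        p += (alpha[i] - beta) * s_list[i]
    return p
```

As a sanity check, the output matches multiplying $q$ by an explicitly maintained $H_{k+1}^{-1}$ built from the inverse update formula.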

Limited memory BFGS goes one step further: instead of doing 2 loops of length $k+1$ for the $(k+1)$th iteration, do 2 loops of length $m$, where $m$ is usually set to 5 or 10. That is, we keep only the $m$ most recent pairs $(s_i, y_i)$. Here is the appropriately modified update:

- $q \leftarrow \nabla f(x_{k+1})$.
- For $i = k, k-1, \dots, k-m+1$: set $\alpha_i = \rho_i s_i^T q$ (where $\rho_i = 1/(y_i^T s_i)$), and $q \leftarrow q - \alpha_i y_i$.
- $p \leftarrow (H_{k+1}^0)^{-1} q$, where $H_{k+1}^0$ is an initial Hessian approximation for iteration $k+1$.
- For $i = k-m+1, \dots, k$: set $\beta_i = \rho_i y_i^T p$, and $p \leftarrow p + (\alpha_i - \beta_i) s_i$.
- Return $p$.

A popular choice for the initial approximation is $(H_{k+1}^0)^{-1} = \dfrac{s_k^T y_k}{y_k^T y_k} I$.
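Putting the pieces together, here is a minimal L-BFGS sketch using the scaled initial inverse above. The Armijo line-search constants, iteration budget, and curvature safeguard are illustrative assumptions, not part of the lecture:

```python
import numpy as np
from collections import deque

def lbfgs(f, grad_f, x0, m=10, iters=200, tol=1e-8):
    """Minimal L-BFGS sketch: stores only the last m (s, y, rho) triples."""
    x = x0.astype(float)
    g = grad_f(x)
    pairs = deque(maxlen=m)                  # oldest pair is dropped automatically
    for _ in range(iters):
        # Two-loop recursion over the stored pairs (newest to oldest, then back).
        q = g.copy()
        alphas = []
        for s, y, rho in reversed(pairs):
            a = rho * (s @ q)
            q -= a * y
            alphas.append(a)
        if pairs:                            # scaled initial inverse (H^0)^{-1} = gamma I
            s, y, _ = pairs[-1]
            gamma = (s @ y) / (y @ y)
        else:
            gamma = 1.0
        p = gamma * q
        for (s, y, rho), a in zip(pairs, reversed(alphas)):
            b = rho * (y @ p)
            p += (a - b) * s
        direction = -p
        # Backtracking (Armijo) line search for the step size t.
        t, fx, slope = 1.0, f(x), g @ direction
        while f(x + t * direction) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x_new = x + t * direction
        g_new = grad_f(x_new)
        s_k, y_k = x_new - x, g_new - g
        if s_k @ y_k > 1e-12:                # keep the update only if curvature holds
            pairs.append((s_k, y_k, 1.0 / (s_k @ y_k)))
        x, g = x_new, g_new
        if np.linalg.norm(g) < tol:
            break
    return x

# Usage: minimize f(x) = 0.5 * sum_i d_i x_i^2 - sum_i x_i, whose minimizer is x_i = 1/d_i.
d = np.linspace(1.0, 10.0, 20)
x_min = lbfgs(lambda x: 0.5 * x @ (d * x) - x.sum(),
              lambda x: d * x - np.ones(20),
              np.zeros(20), m=5)
```

Note that the memory footprint is $O(mn)$ vectors rather than the $O(n^2)$ matrix maintained by full BFGS.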

References:

- Peña, J. (2016). Quasi-Newton Methods. (CMU Convex Optimization: Fall 2016, Lecture on Nov 2.)