# What is Moreau’s Decomposition Theorem?

Let $f: \mathbb{R}^n \mapsto \mathbb{R}$ be a convex function. Moreau’s Decomposition Theorem is the following result:

Theorem (Moreau Decomposition). For all $x \in \mathbb{R}^n$,

$x = \textbf{prox}_{f} (x) + \textbf{prox}_{f^*} (x)$,

where $\textbf{prox}_{f}$ is the proximal operator for $f$ and $f^*$ is the convex conjugate of $f$.

Here is the proof: Let $z = \textbf{prox}_{f} (x)$. Then,

\begin{aligned} z = \textbf{prox}_{f} (x) \quad &\Leftrightarrow \quad x - z \in \partial f(z) \\ &\Leftrightarrow \quad z \in \partial f^*(x-z) \\ &\Leftrightarrow \quad x - (x-z) \in \partial f^*(x-z) \\ &\Leftrightarrow \quad x - z = \textbf{prox}_{f^*} (x) \\ &\Leftrightarrow \quad x = \textbf{prox}_{f} (x) + \textbf{prox}_{f^*} (x). \end{aligned}

The second equivalence is a result involving convex conjugates and subdifferentials (see this post for statement and proof) while the 4th equivalence is a property of the proximal operator (see the note at the end of this post).

Extended Moreau Decomposition

The extended version of Moreau’s Decomposition Theorem involves a scaling factor $\lambda > 0$. It states that for all $x$,

\begin{aligned} x = \textbf{prox}_{\lambda f} (x) + \textbf{prox}_{\lambda^{-1} f^*} (x / \lambda). \end{aligned}

Proof: Applying Moreau decomposition to the function $\lambda f$ gives

$x = \textbf{prox}_{\lambda f} (x) + \textbf{prox}_{(\lambda f)^*} (x).$

Using the definitions of the proximal operator and the convex conjugate,

\begin{aligned} \textbf{prox}_{(\lambda f)^*} (x) &= \underset{u}{\text{argmin}} \left\{ \frac{1}{2} \| x - u \|_2^2 + (\lambda f)^* (u) \right\} \\ &= \underset{u}{\text{argmin}} \left\{ \frac{1}{2} \| x - u \|_2^2 + \sup_v \left[ u^\top v - \lambda f(v) \right] \right\} \\ &= \underset{u}{\text{argmin}} \left\{ \frac{1}{2} \|(x/\lambda) - (u / \lambda) \|_2^2 + \frac{1}{\lambda} \sup_v \left[ (u / \lambda)^\top v - f(v) \right] \right\} \\ &= \underset{u}{\text{argmin}} \left\{ \frac{1}{2} \|(x/\lambda) - (u / \lambda) \|_2^2 + \frac{1}{\lambda} f^*( u / \lambda) \right\} \\ &= \underset{u}{\text{argmin}} \left\{ \frac{1}{2} \| (x / \lambda) - u \|_2^2 + \lambda^{-1}f^* (u) \right\} \\ &= \textbf{prox}_{\lambda^{-1} f^*} (x / \lambda). \end{aligned}

References:

1. Gu, Q. (2016). “SYS 6003: Optimization. Lecture 25.”

# Convex conjugate functions, Fenchel’s inequality, subdifferentials and optimality

Convex conjugate functions and Fenchel’s inequality

Let $f: \mathbb{R}^n \mapsto \mathbb{R}$ be some function. The convex conjugate of $f$ (also know as the Fenchel transformation), is the function $f^*: \mathbb{R}^n \mapsto \mathbb{R}$ defined as

$f^*(y) = \underset{x}{\sup} \left\{ y^\top x - f(x) \right\}.$

Fenchel’s equality is the following statement:

Theorem (Fenchel’s inequality). For any $x, y \in \mathbb{R}^n$,

$f(x) + f^*(y) \geq x^\top y.$

The proof of Fenchel’s equality follows directly from the definition of the convex conjugate:

\begin{aligned} f(x) + f^*(y) &= f(x) + \underset{u}{\sup} \left\{ y^\top u - f(u) \right\} \\ &\geq f(x) + y^\top x - f(x) \\ &= x^\top y. \end{aligned}

A direct application of Fenchel’s inequality shows that the conjugate of the conjugate (the biconjugate), denoted $f^{**}$, always satisfies

$f^{**} \leq f.$

To see this: Fenchel’s inequality says $f(x) \geq x^\top y - f^*(y)$ for all $x, y$. Taking a supremum over all $y$, the inequality becomes

$f(x) \geq \underset{y}{\sup} \left\{ x^\top y - f^*(y) \right\} = f^{**}(x).$

It turns out that if $f$ is closed and convex, then we actually have $f^{**} = f$. The proof is a bit cumbersome and so I’m not providing it in this post. (For a proof, see Reference 1 or Slides 5.13-5.14 of Reference 2.) More generally, the Fenchel-Moreau theorem gives necessary and sufficient conditions for $f^{**} = f$.

Subdifferentials and optimality

A vector $g \in \mathbb{R}^n$ is a subgradient of $f$ at point $x \in \mathbb{R}^n$ if

$f(y) \geq f(x) + g^\top (y - x)$ for all $y \in \mathbb{R}^n$.

The subdifferential of $f$ at point $x$, denoted $\partial f(x)$, is the set of all the subgradients of $f$ and point $x$.

Convex conjugates appear in the following theorem concerning subdifferentials:

Theorem. If $f$ is closed and convex, then for any $x, y \in \mathbb{R}^n$,

\begin{aligned} y \in \partial f(x) \; \Leftrightarrow \; x \in \partial f^*(y) \; \Leftrightarrow \; f(x) + f^*(y) = x^\top y. \end{aligned}

Proof (adapted from Slide 5.15 of Reference 2 and Reference 3): First we show that $y \in \partial f(x) \; \Rightarrow \; x \in \partial f^*(y)$. Using the definitions of subgradients and convex conjugates,

\begin{aligned} y \in \partial f(x) &\Rightarrow \; f(u) \geq f(x) + y^\top (u - x) \quad \text{for all } u \\ &\Rightarrow \; y^\top u - f(u) \leq y^\top x - f(x) \quad \text{for all } u \\ &\Rightarrow f^*(y) = y^\top x - f(x). \end{aligned}

Thus, for any $v$,

\begin{aligned} f^*(v) &= \underset{u}{\sup} \left\{ v^\top u - f(u) \right\} \\ &\geq v^\top x - f(x) \\ &= y^\top x - f(x) + x^\top (v - y) \\ &= f^*(y) + x^\top (v - y), \end{aligned}

which implies that $x \in \partial f^*(y)$.

To get the reverse implication $\partial x \in f^*(y) \; \Rightarrow \; y \in \partial f(x)$, apply the same logic above with $f^*$ in place of $f$ and use the fact that $f^{**} = f$ for closed convex $f$.

Next, we show that $y \in \partial f(x) \; \Rightarrow \; f(x) + f^*(y) = x^\top y$. Using the definitions of subgradients,

\begin{aligned} y \in \partial f(x) &\Rightarrow \; f(u) \geq f(x) + y^\top (u - x) \quad \text{for all } u \\ &\Rightarrow \; x^\top y \geq f(x) + (y^\top u - f(u)) \quad \text{for all } u. \end{aligned}

Taking supremum over all $u$, the inequality becomes $x^\top y \geq f(x) + f^*(y)$. Combining this with Fenchel’s inequality gives us equality $x^\top y = f(x) + f^*(y)$.

Conversely, if $x^\top y = f(x) + f^*(y)$, we have

\begin{aligned} x^\top y &\geq f(x) + f^*(y), \\ x^\top y &\geq f(x) + y^\top u - f(u) \quad \text{for all } u, \\ f(u) &\geq f(x) + y^\top (u - x) \quad \text{for all } u, \\ y &\in \partial f(x). \end{aligned}

References:

1. StackExchange. “How to prove the conjugate of the conjugate function is itself?”
2. Vandenberghe, L. (2022). “5. Conjugate functions.”