Convex conjugate functions, Fenchel’s inequality, subdifferentials and optimality

Convex conjugate functions and Fenchel’s inequality

Let f: \mathbb{R}^n \mapsto \mathbb{R} be some function. The convex conjugate of f (also know as the Fenchel transformation), is the function f^*: \mathbb{R}^n \mapsto \mathbb{R} defined as

f^*(y) = \underset{x}{\sup} \left\{ y^\top x - f(x) \right\}.

Fenchel’s equality is the following statement:

Theorem (Fenchel’s inequality). For any x, y \in \mathbb{R}^n,

f(x) + f^*(y) \geq x^\top y.

The proof of Fenchel’s equality follows directly from the definition of the convex conjugate:

\begin{aligned} f(x) + f^*(y) &= f(x) + \underset{u}{\sup} \left\{ y^\top u - f(u) \right\}  \\  &\geq f(x) + y^\top x - f(x) \\  &= x^\top y. \end{aligned}

A direct application of Fenchel’s inequality shows that the conjugate of the conjugate (the biconjugate), denoted f^{**}, always satisfies

f^{**} \leq f.

To see this: Fenchel’s inequality says f(x) \geq x^\top y - f^*(y) for all x, y. Taking a supremum over all y, the inequality becomes

f(x) \geq \underset{y}{\sup} \left\{ x^\top y - f^*(y) \right\} = f^{**}(x).

It turns out that if f is closed and convex, then we actually have f^{**} = f. The proof is a bit cumbersome and so I’m not providing it in this post. (For a proof, see Reference 1 or Slides 5.13-5.14 of Reference 2.) More generally, the Fenchel-Moreau theorem gives necessary and sufficient conditions for f^{**} = f.

Subdifferentials and optimality

A vector g \in \mathbb{R}^n is a subgradient of f at point x \in \mathbb{R}^n if

f(y) \geq f(x) + g^\top (y - x) for all y \in \mathbb{R}^n.

The subdifferential of f at point x, denoted \partial f(x), is the set of all the subgradients of f and point x.

Convex conjugates appear in the following theorem concerning subdifferentials:

Theorem. If f is closed and convex, then for any x, y \in \mathbb{R}^n,

\begin{aligned} y \in \partial f(x) \; \Leftrightarrow \; x \in \partial f^*(y) \; \Leftrightarrow \; f(x) + f^*(y) = x^\top y. \end{aligned}

Proof (adapted from Slide 5.15 of Reference 2 and Reference 3): First we show that y \in \partial f(x) \; \Rightarrow \; x \in \partial f^*(y). Using the definitions of subgradients and convex conjugates,

\begin{aligned} y \in \partial f(x) &\Rightarrow \; f(u) \geq f(x) + y^\top (u - x) \quad \text{for all } u \\  &\Rightarrow \; y^\top u - f(u) \leq y^\top x - f(x) \quad \text{for all } u \\  &\Rightarrow f^*(y) = y^\top x - f(x). \end{aligned}

Thus, for any v,

\begin{aligned} f^*(v) &= \underset{u}{\sup} \left\{ v^\top u - f(u) \right\} \\  &\geq v^\top x - f(x) \\  &= y^\top x - f(x) + x^\top (v - y) \\  &= f^*(y) + x^\top (v - y), \end{aligned}

which implies that x \in \partial f^*(y).

To get the reverse implication \partial x \in f^*(y) \; \Rightarrow \; y \in \partial f(x), apply the same logic above with f^* in place of f and use the fact that f^{**} = f for closed convex f.

Next, we show that y \in \partial f(x) \; \Rightarrow \; f(x) + f^*(y) = x^\top y. Using the definitions of subgradients,

\begin{aligned} y \in \partial f(x) &\Rightarrow \; f(u) \geq f(x) + y^\top (u - x) \quad \text{for all } u \\  &\Rightarrow \; x^\top y \geq f(x) + (y^\top u - f(u)) \quad \text{for all } u. \end{aligned}

Taking supremum over all u, the inequality becomes x^\top y \geq f(x) + f^*(y). Combining this with Fenchel’s inequality gives us equality x^\top y = f(x) + f^*(y).

Conversely, if x^\top y = f(x) + f^*(y), we have

\begin{aligned} x^\top y &\geq f(x) + f^*(y), \\  x^\top y &\geq f(x) + y^\top u - f(u) \quad \text{for all } u, \\  f(u) &\geq f(x) + y^\top (u - x) \quad \text{for all } u, \\  y &\in \partial f(x). \end{aligned}


  1. StackExchange. “How to prove the conjugate of the conjugate function is itself?”
  2. Vandenberghe, L. (2022). “5. Conjugate functions.”
  3. StackExchange. “Proof about Conjugate and subgradient.”