Optimality condition for Euclidean projection onto a convex polytope

Let P = \{ p_1, \dots, p_m \} be a set of points in \mathbb{R}^n, and let C(P) denote the convex hull of P. We are often interested in finding the point in C(P) that is nearest to some reference point \hat{x}, known as the Euclidean projection onto C(P). Let’s denote this point by \text{proj}_{C(P)}(\hat{x}).

Optimality condition for \text{proj}_{C(P)}(\hat{x}) when \hat{x} = 0

Wolfe (1976) (Reference 1) provides an optimality condition for \text{proj}_{C(P)}(\hat{x}) when the reference point is the origin, i.e. \hat{x} = 0.

Theorem. Let x be a point in C(P). x = \text{proj}_{C(P)}(0) if and only if

\begin{aligned} x^\top p_j \geq \| x \|^2 \quad \text{for all } j = 1, \dots, m. \end{aligned}

Geometric intuition

Rearranging the condition above gives us x^\top (p_j - x) \geq 0. Fixing x, note that x^\top (y - x) = 0 represents the hyperplane passing through x which is orthogonal to the line segment going from 0 to x. Hence, the condition x^\top (p_j - x) \geq 0 means that the directed line segment from 0 to x and the directed line segment from x to p_j make an angle that is \leq 90^\circ. Geometrically, this means that p_j and 0 lie on different sides of the hyperplane x^\top (y - x) = 0, and so x is nearer to 0 than p_j.

The diagram below demonstrates this in two dimensions. In both figures, the candidate for being the nearest point, x, is X = P_1. P_1 is the nearest point to the origin in the left figure but not the right figure. The angles that the optimality condition looks at are the angles between the black arrow and the colored arrows: they are also marked by little arcs. We see that in the left figure, both the angles are \leq 90^\circ. In the right figure, the yellow angle is > 90^\circ. It corresponds to a vertex of the polytope that is on the same side of the hyperplane (dotted line) as the origin, and hence some point on the line XP_3 is going to be closer to the origin than X.

Optimality condition for any \hat{x}

Let P - \hat{x} = \{ p_1 - \hat{x}, \dots, p_m - \hat{x} \}. Projecting \hat{x} onto C(P) is equivalent to projecting 0 onto C(P - \hat{x}) and then adding \hat{x} to it:

\begin{aligned} \text{proj}_{C(P)}(\hat{x}) &= \hat{x} + \text{proj}_{C(P - \hat{x})}(0), \\  \text{proj}_{C(P - \hat{x})}(0) &= \text{proj}_{C(P)}(\hat{x}) - \hat{x}. \end{aligned}

Hence, a point x \in C(P) is \text{proj}_{C(P)}(\hat{x}) if and only if

\begin{aligned} (x - \hat{x})^\top (p_j - \hat{x}) &\geq \|x - \hat{x}\|^2 \quad \text{for all } j = 1, \dots, m, \\  (x - \hat{x})^\top p_j &\geq (x - \hat{x})^\top x \quad \text{for all } j = 1, \dots, m. \end{aligned}

(Note: I learned of the iff condition for general \hat{x} from Reference 2.)

References:

  1. Wolfe, P. (1976). Finding the nearest point in a polytope.
  2. Ramanath, R. (2022). Efficient vertex-oriented polytopic projection for web-scale applications.
Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s