# O(n log n) algorithm for Euclidean projection onto a simplex

Duchi et. al. (2008) (Reference 1) has a simple and efficient algorithm for projecting a point $v \in \mathbb{R}^n$ onto the simplex. Formally, the optimization problem is \begin{aligned} \text{minimize}_{w \in \mathbb{R}^n} \quad& \| w - v \|^2 \\ \text{subject to} \quad& \sum_{i=1}^n w_i = z, \\ & w_i \geq 0 \text{ for all } i, \end{aligned}

where $\| \cdot \|$ represents the Euclidean norm and $z > 0$ is some constant. Their algorithm runs in $O(n \log n)$ time, with the most expensive step being sorting the elements of $v$ in descending order. Here is the algorithm:

1. Sort the elements of $v$ in descending order: $v_{(1)} \geq \dots \geq v_{(n)}$.
2. Find $\rho = \max \left\{ j: v_{(j)} - \frac{1}{j} \left( \sum_{r=1}^j v_{(r)} - z \right) > 0\right\}$.
3. Define $\theta = \frac{1}{\rho} \left( \sum_{i=1}^\rho v_{(i)} - z \right)$.
4. Output $w \in \mathbb{R}^n$ with $w_i = \max \{ v_i - \theta, 0 \}$.

Note: In some cases, we know that the projection is very likely to be a vertex of the simplex (for example, if the point is very far away from the simplex). For these cases, Ramanath et. al. (2021) (Reference 2, Algorithm 2) have a modified version of the algorithm above (based on Wolfe’s algorithm) that almost always runs in $O(n)$ time. For large $n$, dropping the logarithmic factor can make a significant difference.

References:

1. Duchi, J., et. al. (2008). Efficient projections onto the l1-ball for learning in high dimensions.
2. Ramanath, R., et. al. (2021). Efficient Vertex-Oriented Polytopic Projection for Web-scale Applications.