Frank-Wolfe methods are popular for optimization over a polytope. One reason is that they do not need projection onto the polytope but only linear optimization over it. To understand their complexity, Lacoste-Julien and Jaggi introduced a condition number for polytopes and showed linear convergence for several variations of the method. The actual running time can still be exponential in the worst case (when the condition number is exponential). We study the smoothed complexity of the condition number, namely the condition number of small random perturbations of the input polytope, and show that it is polynomial for any simplex and exponential for general polytopes. Our results also apply to other condition measures of polytopes that have been proposed for the analysis of Frank-Wolfe methods: vertex-facet distance (Beck and Shtern) and facial distance (Peña and Rodriguez). Our argument for polytopes is a refinement of an argument that we develop to study the conditioning of random matrices. The basic argument shows that for $c>1$ a $d$-by-$n$ random Gaussian matrix with $n \geq cd$ has a $d$-by-$d$ submatrix with minimum singular value that is exponentially small with high probability. This has consequences for results about the robust uniqueness of tensor decompositions.
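To illustrate the projection-free property mentioned above, here is a minimal sketch of the classical Frank-Wolfe iteration over the probability simplex. It is not one of the linearly convergent variants analyzed by Lacoste-Julien and Jaggi; the function name `frank_wolfe_simplex`, the step size $2/(t+2)$, and the quadratic test objective are assumptions chosen for illustration.

```python
def frank_wolfe_simplex(grad, dim, iters=2000):
    """Classical Frank-Wolfe over the probability simplex (illustrative sketch).

    grad: callable returning the gradient of the objective at x (a list).
    The only access to the polytope is a linear minimization oracle (LMO);
    no projection is ever computed.
    """
    x = [1.0 / dim] * dim  # start at the barycenter, a feasible point
    for t in range(iters):
        g = grad(x)
        # LMO over the simplex: the minimizing vertex is the unit vector e_i
        # whose index i has the smallest gradient coordinate.
        i = min(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (t + 2.0)  # standard open-loop step size (assumption)
        # Convex combination x <- (1 - gamma) * x + gamma * e_i stays feasible.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Hypothetical example: minimize 0.5 * ||x - b||^2 with b inside the simplex.
b = [0.2, 0.3, 0.5]
x_star = frank_wolfe_simplex(lambda x: [xj - bj for xj, bj in zip(x, b)], dim=3)
```

Each iterate is a convex combination of vertices, so feasibility holds by construction; this is the reason only linear optimization over the polytope is needed.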
We study projection-free methods for constrained Riemannian optimization. In particular, we propose the Riemannian Frank-Wolfe (RFW) method. We analyze non-asymptotic convergence rates of RFW to an optimum for (geodesically) convex problems, and to a critical point for nonconvex objectives. We also present a practical setting under which RFW can attain a linear convergence rate. As a concrete example, we specialize RFW to the manifold of positive definite matrices and apply it to two tasks: (i) computing the matrix geometric mean (Riemannian centroid); and (ii) computing the Bures-Wasserstein barycenter. Both tasks involve geodesically convex interval constraints, for which we show that the Riemannian linear oracle required by RFW admits a closed-form solution; this result may be of independent interest. We further specialize RFW to the special orthogonal group and show that here too, the Riemannian linear oracle can be solved in closed form. Here, we describe an application to the synchronization of data matrices (Procrustes problem). We complement our theoretical results with an empirical comparison of RFW against state-of-the-art Riemannian optimization methods and observe that RFW performs competitively on the task of computing Riemannian centroids.
We propose a variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems. Our formulation includes Elastic Net, regularized SVMs and phase retrieval as special cases. The proposed Primal-Dual Block Frank-Wolfe algorithm reduces the per-iteration cost while maintaining a linear convergence rate. The per-iteration cost of our method depends on the structural complexity of the solution (i.e., its sparsity or rank) instead of the ambient dimension. We empirically show that our algorithm outperforms state-of-the-art methods on (multi-class) classification tasks.
A question related to some conjectures of Lutwak about the affine quermassintegrals of a convex body $K$ in ${\mathbb R}^n$ asks whether for every convex body $K$ in ${\mathbb R}^n$ and all $1\leqslant k\leqslant n$ $$\Phi_{[k]}(K):={\rm vol}_n(K)^{-\frac{1}{n}}\left (\int_{G_{n,k}}{\rm vol}_k(P_F(K))^{-n}\,d\nu_{n,k}(F)\right )^{-\frac{1}{kn}}\leqslant c\sqrt{n/k},$$ where $c>0$ is an absolute constant. We provide an affirmative answer for some broad classes of random polytopes. We also discuss upper bounds for $\Phi_{[k]}(K)$ when $K=B_1^n$, the unit ball of $\ell_1^n$, and explain how this special instance has implications for the case of a general unconditional convex body $K$.
We show that the smoothed complexity of the FLIP algorithm for local Max-Cut is at most $\smash{\phi n^{O(\sqrt{\log n})}}$, where $n$ is the number of nodes in the graph and $\phi$ is a parameter that measures the magnitude of perturbations applied on its edge weights. This improves the previously best upper bound of $\phi n^{O(\log n)}$ by Etscheid and Röglin. Our result is based on an analysis of long sequences of flips, which shows that it is very unlikely for every flip in a long sequence to incur a positive but small improvement in the cut weight. We also extend the same upper bound on the smoothed complexity of FLIP to all binary Maximum Constraint Satisfaction Problems.
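For readers unfamiliar with FLIP, the following is a minimal sketch of the local search it refers to: move a single node to the other side of the cut whenever doing so strictly increases the cut weight, and stop at a local optimum. The abstract does not fix a pivoting rule, so the round-robin node order here is an assumption, as are the function names and the edge-dictionary representation.

```python
def cut_weight(weights, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for (u, v), w in weights.items() if side[u] != side[v])


def flip_gain(weights, side, node):
    """Change in cut weight if `node` is moved to the other side."""
    gain = 0.0
    for (u, v), w in weights.items():
        if node in (u, v):
            # An uncut edge becomes cut (+w); a cut edge becomes uncut (-w).
            gain += w if side[u] == side[v] else -w
    return gain


def flip_local_max_cut(n, weights, side=None):
    """Run FLIP until no single-node move improves the cut.

    Returns the final partition and the number of flips performed.
    The round-robin scan order is an illustrative choice (assumption).
    """
    side = list(side) if side is not None else [0] * n
    flips = 0
    improved = True
    while improved:
        improved = False
        for node in range(n):
            if flip_gain(weights, side, node) > 0:
                side[node] ^= 1  # move the node across the cut
                flips += 1
                improved = True
    return side, flips


# Hypothetical instance: a weighted triangle, starting with all nodes on side 0.
weights = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}
side, flips = flip_local_max_cut(3, weights)
```

Termination follows because every flip strictly increases the cut weight and there are finitely many partitions; the smoothed analysis concerns how many such flips can occur when the edge weights are randomly perturbed.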
A two-step model for generating random polytopes is considered. For parameters $d$, $m$, and $p$, the first step is to generate a simple polytope $P$ whose facets are given by $m$ uniform random hyperplanes tangent to the unit sphere in $\mathbb{R}^d$, and the second step is to sample each vertex of $P$ independently with probability $p$ and let $Q$ be the convex hull of the sampled vertices. We establish results on how well $Q$ approximates the unit sphere in terms of $m$ and $p$ as well as asymptotics on the combinatorial complexity of $Q$ for certain regimes of $p$.
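The two-step model can be sketched in the planar case $d=2$, where vertex enumeration is elementary: consecutive tangent lines at angles $a$ and $a+g$ meet at the point with direction $a+g/2$ and radius $1/\cos(g/2)$. The restriction to $d=2$, the rejection of unbounded samples, and the function names are assumptions made for illustration only.

```python
import math
import random


def tangent_polygon_vertices(m, rng):
    """Step 1 (d = 2 only): vertices of the polygon cut out by m random
    tangent lines to the unit circle. Unbounded samples are rejected
    (an assumption of this sketch)."""
    if m < 3:
        raise ValueError("need at least 3 tangent lines for a bounded polygon")
    two_pi = 2.0 * math.pi
    while True:
        angles = sorted(rng.uniform(0.0, two_pi) for _ in range(m))
        gaps = [(angles[(i + 1) % m] - angles[i]) % two_pi for i in range(m)]
        # The polygon is bounded exactly when every angular gap is below pi.
        if max(gaps) < math.pi - 1e-9:
            break
    vertices = []
    for a, gap in zip(angles, gaps):
        mid = a + gap / 2.0
        r = 1.0 / math.cos(gap / 2.0)  # distance of the intersection from 0
        vertices.append((r * math.cos(mid), r * math.sin(mid)))
    return vertices


def sample_vertices(vertices, p, rng):
    """Step 2: keep each vertex independently with probability p.
    In the plane the kept vertices are already in convex position,
    so they are the vertices of Q in cyclic order."""
    return [v for v in vertices if rng.random() < p]


rng = random.Random(0)
P = tangent_polygon_vertices(8, rng)
Q = sample_vertices(P, 0.5, rng)
```

Every vertex of $P$ lies on or outside the unit circle, so $Q$ approximates the disc from outside only where enough vertices survive the sampling step.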