ﻻ يوجد ملخص باللغة العربية
We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We further provide a converse that shows that when the constraints are not quadratically convex---for example, any $ell_p$-ball for $p < 2$---the methods are far from optimal. Based on this, we can provide concrete recommendations for when one should use adaptive, mirror or stochastic gradient methods.
Convergence of the gradient descent algorithm has been attracting renewed interest due to its utility in deep learning applications. Even as multiple variants of gradient descent were proposed, the assumption that the gradient of the objective is Lip
Minimizing the rank of a matrix subject to constraints is a challenging problem that arises in many applications in control theory, machine learning, and discrete geometry. This class of optimization problems, known as rank minimization, is NP-HARD,
The present work establishes necessary and sufficient conditions for a nonlinear system with two inputs to be described by a specific triangular form. Except for some regularity conditions, such triangular form is flat. This may lead to the discovery
The (global) Lipschitz smoothness condition is crucial in establishing the convergence theory for most optimization methods. Unfortunately, most machine learning and signal processing problems are not Lipschitz smooth. This motivates us to generalize
We develop a new primitive for stochastic optimization: a low-bias, low-cost estimator of the minimizer $x_star$ of any Lipschitz strongly-convex function. In particular, we use a multilevel Monte-Carlo approach due to Blanchet and Glynn to turn any