Do you want to publish a course? Click here

From Nesterovs Estimate Sequence to Riemannian Acceleration

86   0   0.0 ( 0 )
 Added by Suvrit Sra
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

We propose the first global accelerated gradient method for Riemannian manifolds. Toward establishing our result we revisit Nesterovs estimate sequence technique and develop an alternative analysis for it that may also be of independent interest. Then, we extend this analysis to the Riemannian setting, localizing the key difficulty due to non-Euclidean structure into a certain ``metric distortion. We control this distortion by developing a novel geometric inequality, which permits us to propose and analyze a Riemannian counterpart to Nesterovs accelerated gradient method.



rate research

Read More

We further research on the acceleration phenomenon on Riemannian manifolds by introducing the first global first-order method that achieves the same rates as accelerated gradient descent in the Euclidean space for the optimization of smooth and geodesically convex (g-convex) or strongly g-convex functions defined on the hyperbolic space or a subset of the sphere, up to constants and log factors. To the best of our knowledge, this is the first method that is proved to achieve these rates globally on functions defined on a Riemannian manifold $mathcal{M}$ other than the Euclidean space. As a proxy, we solve a constrained non-convex Euclidean problem, under a condition between convexity and quasar-convexity, of independent interest. Additionally, for any Riemannian manifold of bounded sectional curvature, we provide reductions from optimization methods for smooth and g-convex functions to methods for smooth and strongly g-convex functions and vice versa.
We propose a novel second-order ODE as the continuous-time limit of a Riemannian accelerated gradient-based method on a manifold with curvature bounded from below. This ODE can be seen as a generalization of the ODE derived for Euclidean spaces, and can also serve as an analysis tool. We study the convergence behavior of this ODE for different classes of functions, such as geodesically convex, strongly-convex and weakly-quasi-convex. We demonstrate how such an ODE can be discretized using a semi-implicit and Nesterov-inspired numerical integrator, that empirically yields stable algorithms which are faithful to the continuous-time analysis and exhibit accelerated convergence.
137 - Tao Hong , Irad Yavneh 2021
Nesterovs well-known scheme for accelerating gradient descent in convex optimization problems is adapted to accelerating stationary iterative solvers for linear systems. Compared with classical Krylov subspace acceleration methods, the proposed scheme requires more iterations, but it is trivial to implement and retains essentially the same computational cost as the unaccelerated method. An explicit formula for a fixed optimal parameter is derived in the case where the stationary iteration matrix has only real eigenvalues, based only on the smallest and largest eigenvalues. The fixed parameter, and corresponding convergence factor, are shown to maintain their optimality when the iteration matrix also has complex eigenvalues that are contained within an explicitly defined disk in the complex plane. A comparison to Chebyshev acceleration based on the same information of the smallest and largest real eigenvalues (dubbed Restricted Information Chebyshev acceleration) demonstrates that Nesterovs scheme is more robust in the sense that it remains optimal over a larger domain when the iteration matrix does have some complex eigenvalues. Numerical tests validate the efficiency of the proposed scheme. This work generalizes and extends the results of [1, Lemmas 3.1 and 3.2 and Theorem 3.3].
We study the convergence issue for the gradient algorithm (employing general step sizes) for optimization problems on general Riemannian manifolds (without curvature constraints). Under the assumption of the local convexity/quasi-convexity (resp. weak sharp minima), local/global convergence (resp. linear convergence) results are established. As an application, the linear convergence properties of the gradient algorithm employing the constant step sizes and the Armijo step sizes for finding the Riemannian $L^p$ ($pin[1,+infty)$) centers of mass are explored, respectively, which in particular extend and/or improve the corresponding results in cite{Afsari2013}.
This paper considers a class of constrained convex stochastic composite optimization problems whose objective function is given by the summation of a differentiable convex component, together with a nonsmooth but convex component. The nonsmooth component has an explicit max structure that may not easy to compute its proximal mapping. In order to solve these problems, we propose a mini-batch stochastic Nesterovs smoothing (MSNS) method. Convergence and the optimal iteration complexity of the method are established. Numerical results are provided to illustrate the efficiency of the proposed MSNS method for a support vector machine (SVM) model.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا