
Global Riemannian Acceleration in Hyperbolic and Spherical Spaces

Posted by: David Martínez-Rubio
Publication date: 2020
Research field: Mathematical statistics
Paper language: English





We further the research on the acceleration phenomenon on Riemannian manifolds by introducing the first global first-order method that achieves the same rates as accelerated gradient descent in the Euclidean space for the optimization of smooth and geodesically convex (g-convex) or strongly g-convex functions defined on the hyperbolic space or a subset of the sphere, up to constants and log factors. To the best of our knowledge, this is the first method that is proved to achieve these rates globally on functions defined on a Riemannian manifold $\mathcal{M}$ other than the Euclidean space. As a proxy, we solve a constrained non-convex Euclidean problem, under a condition between convexity and quasar-convexity, of independent interest. Additionally, for any Riemannian manifold of bounded sectional curvature, we provide reductions from optimization methods for smooth and g-convex functions to methods for smooth and strongly g-convex functions, and vice versa.
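For reference, the Euclidean accelerated rates that the abstract says are matched (up to constants and log factors) are the standard Nesterov rates; they are written here only as a reminder, with $L$-smoothness, initial distance $D$ to a minimizer $x^*$, and condition number $\kappa = L/\mu$ in the strongly g-convex case, and are not taken from the paper itself:

$$f(x_k) - f(x^*) = O\!\left(\frac{L D^2}{k^2}\right) \quad\text{(smooth and g-convex)}, \qquad f(x_k) - f(x^*) = O\!\left(L D^2 \Big(1 - \tfrac{1}{\sqrt{\kappa}}\Big)^{k}\right) \quad\text{(smooth and strongly g-convex)}.$$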


Read also

Kwangjun Ahn, Suvrit Sra (2020)
We propose the first global accelerated gradient method for Riemannian manifolds. Toward establishing our result, we revisit Nesterov's estimate sequence technique and develop an alternative analysis for it that may also be of independent interest. Then, we extend this analysis to the Riemannian setting, localizing the key difficulty due to non-Euclidean structure into a certain "metric distortion". We control this distortion by developing a novel geometric inequality, which permits us to propose and analyze a Riemannian counterpart to Nesterov's accelerated gradient method.
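As background for the estimate-sequence discussion above, here is a minimal sketch of the standard Euclidean Nesterov accelerated gradient method; it is not Ahn and Sra's Riemannian algorithm, and the step size $1/L$ and momentum schedule are the usual textbook choices assumed for illustration. The Riemannian variants replace the linear averaging steps with exponential/logarithm maps along geodesics.

```python
import numpy as np

def nesterov_agd(grad_f, x0, L, num_iters=1000):
    """Euclidean Nesterov accelerated gradient descent for an L-smooth convex f."""
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    t = 1.0
    for _ in range(num_iters):
        x_next = y - grad_f(y) / L                          # gradient step at the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0   # momentum schedule
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)    # extrapolation (linear averaging)
        x, t = x_next, t_next
    return x

# Example: minimize the quadratic f(x) = 0.5 * x^T A x, whose minimizer is the origin.
if __name__ == "__main__":
    A = np.diag([1.0, 10.0, 100.0])
    print(nesterov_agd(lambda x: A @ x, x0=np.ones(3), L=100.0))
```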
We propose a novel second-order ODE as the continuous-time limit of a Riemannian accelerated gradient-based method on a manifold with curvature bounded from below. This ODE can be seen as a generalization of the ODE derived for Euclidean spaces, and can also serve as an analysis tool. We study the convergence behavior of this ODE for different classes of functions, such as geodesically convex, strongly convex, and weakly quasi-convex. We demonstrate how such an ODE can be discretized using a semi-implicit, Nesterov-inspired numerical integrator that empirically yields stable algorithms which are faithful to the continuous-time analysis and exhibit accelerated convergence.
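For context, the Euclidean ODE being generalized here is presumably the well-known continuous-time limit of Nesterov's method due to Su, Boyd, and Candès,

$$\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0,$$

with the Riemannian version replacing the time derivatives and the gradient by their manifold counterparts (covariant acceleration and Riemannian gradient).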
We study the convergence of gradient flows related to learning deep linear neural networks (where the activation function is the identity map) from data. In this case, the composition of the network layers amounts to simply multiplying the weight matrices of all layers together, resulting in an overparameterized problem. The gradient flow with respect to these factors can be re-interpreted as a Riemannian gradient flow on the manifold of rank-$r$ matrices endowed with a suitable Riemannian metric. We show that the flow always converges to a critical point of the underlying functional. Moreover, we establish that, for almost all initializations, the flow converges to a global minimum on the manifold of rank-$k$ matrices for some $k \leq r$.
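To make the setup concrete, the sketch below forward-Euler-discretizes the gradient flow on the factors of a two-layer linear network $W_2 W_1$ fitted by least squares; the depth, loss, data, and step size are illustrative assumptions, not the paper's exact setting or result.

```python
import numpy as np

# Deep linear network: predictions are (W2 @ W1) @ X. Gradient flow on the
# factors (W1, W2) for L = 0.5 * ||W2 W1 X - Y||_F^2, with forward-Euler step eta.
rng = np.random.default_rng(0)
d_in, hidden, d_out, n = 10, 5, 3, 50
X = rng.standard_normal((d_in, n))
Y = rng.standard_normal((d_out, d_in)) @ X       # targets realizable by a linear map
W1 = 0.1 * rng.standard_normal((hidden, d_in))   # small random initialization
W2 = 0.1 * rng.standard_normal((d_out, hidden))
eta = 1e-3
for step in range(20000):
    E = W2 @ W1 @ X - Y            # residual of the end-to-end map
    grad_W2 = E @ (W1 @ X).T       # dL/dW2
    grad_W1 = W2.T @ E @ X.T       # dL/dW1
    W2 -= eta * grad_W2
    W1 -= eta * grad_W1
print(np.linalg.norm(W2 @ W1 @ X - Y))  # training loss should be close to zero
```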
Toshiyuki Sugawa (2017)
Let $\Omega$ be a domain in $\mathbb{C}$ with hyperbolic metric $\lambda_\Omega(z)\,|dz|$ of Gaussian curvature $-4$. Mejia and Minda proved in their 1990 paper that $\Omega$ is (Euclidean) convex if and only if $d(z,\partial\Omega)\,\lambda_\Omega(z)\ge 1/2$ for $z\in\Omega$, where $d(z,\partial\Omega)$ denotes the Euclidean distance from $z$ to the boundary $\partial\Omega$. In the present note, we will provide similar characterizations of spherically convex domains in terms of the spherical density of the hyperbolic metric.
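As a quick sanity check of the criterion quoted above (a worked example, not from the paper): for the unit disk $\mathbb{D}$, the hyperbolic metric of curvature $-4$ has density $\lambda_{\mathbb{D}}(z) = 1/(1-|z|^2)$ and $d(z,\partial\mathbb{D}) = 1-|z|$, so

$$d(z,\partial\mathbb{D})\,\lambda_{\mathbb{D}}(z) = \frac{1-|z|}{1-|z|^2} = \frac{1}{1+|z|} \ge \frac{1}{2},$$

which is consistent with the (Euclidean) convexity of the disk.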
The paper proves convergence to global optima for a class of distributed algorithms for nonconvex optimization in network-based multi-agent settings. Agents are permitted to communicate over a time-varying undirected graph. Each agent is assumed to possess a local objective function (assumed to be smooth, but possibly nonconvex). The paper considers algorithms for optimizing the sum function. A distributed algorithm of the consensus+innovations type is proposed which relies on first-order information at the agent level. Under appropriate conditions on network connectivity and the cost objective, convergence to the set of global optima is achieved by an annealing-type approach, with decaying Gaussian noise independently added into each agent's update step. It is shown that the proposed algorithm converges in probability to the set of global minima of the sum function.
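A minimal sketch of one iteration of a consensus+innovations update with annealing noise, of the general form described above, is given below; the specific weight and noise schedules and the fixed neighbor structure are illustrative assumptions and do not reproduce the paper's exact conditions.

```python
import numpy as np

def distributed_annealing_step(x, neighbors, grads, t, rng):
    """One synchronous consensus+innovations update with decaying Gaussian noise.

    x         : (num_agents, dim) array of current iterates, one row per agent
    neighbors : dict mapping each agent index to a list of neighbor indices
    grads     : list of local gradient functions, one per agent
    t         : iteration counter (t >= 1)
    """
    beta = 1.0 / t ** 0.6                                   # consensus weight (assumed schedule)
    alpha = 1.0 / t                                          # innovation / gradient weight (assumed)
    gamma = 1.0 / (t ** 0.5 * np.sqrt(np.log(t + 1.0)))     # annealing noise scale (assumed)
    x_new = x.copy()
    for i in range(x.shape[0]):
        consensus = sum(x[i] - x[j] for j in neighbors[i])  # disagreement with neighbors
        noise = rng.standard_normal(x.shape[1])              # independent Gaussian perturbation
        x_new[i] = x[i] - beta * consensus - alpha * grads[i](x[i]) + gamma * noise
    return x_new
```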