
Accelerated Algorithms for Convex and Non-Convex Optimization on Manifolds

Published by Michael Minyi Zhang
Publication date: 2020
Research language: English





We propose a general scheme for solving convex and non-convex optimization problems on manifolds. The central idea is that, by adding a multiple of the squared retraction distance to the objective function in question, we convexify the objective function and solve a series of convex sub-problems in the optimization procedure. One of the key challenges for optimization on manifolds is the difficulty of verifying the complexity of the objective function, e.g., whether the objective function is convex or non-convex, and the degree of non-convexity. Our proposed algorithm adapts to the level of complexity in the objective function. We show that when the objective function is convex, the algorithm provably converges to the optimum and leads to accelerated convergence. When the objective function is non-convex, the algorithm will converge to a stationary point. Our proposed method unifies insights from Nesterov's original idea for accelerating gradient descent algorithms with recent developments in optimization algorithms in Euclidean space. We demonstrate the utility of our algorithms on several manifold optimization tasks such as estimating intrinsic and extrinsic Fréchet means on spheres and low-rank matrix factorization with Grassmann manifolds applied to the Netflix rating data set.
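To make the convexification idea concrete, here is a minimal sketch (not the authors' implementation) of a proximal-point-style outer loop on the unit sphere: each sub-problem adds a multiple of a squared distance term to the objective and is solved approximately by Riemannian gradient steps. The retraction, step sizes, and the extrinsic Fréchet-mean example below are illustrative assumptions.

```python
import numpy as np

def retract(x, v):
    """Map a tangent step v at x back onto the unit sphere (a standard retraction)."""
    y = x + v
    return y / np.linalg.norm(y)

def riem_grad(x, egrad):
    """Riemannian gradient on the sphere: project the Euclidean gradient onto the tangent space at x."""
    return egrad - np.dot(egrad, x) * x

def prox_step(f_grad, x_k, lam=1.0, inner_iters=50, step=0.1):
    """Approximately minimize f(x) + (lam/2) * ||x - x_k||^2 over the sphere
    (the convexified sub-problem) by Riemannian gradient descent."""
    x = x_k.copy()
    for _ in range(inner_iters):
        egrad = f_grad(x) + lam * (x - x_k)   # gradient of the regularized objective
        x = retract(x, -step * riem_grad(x, egrad))
    return x

# Toy example: extrinsic Fréchet mean of random points on the 2-sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
f_grad = lambda x: np.mean(x - pts, axis=0)   # gradient of (1/2n) * sum_i ||x - p_i||^2

x = pts[0].copy()
for _ in range(30):                           # outer proximal-point loop
    x = prox_step(f_grad, x)
print("estimated extrinsic mean:", x)
```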




Read also

While recent work suggests that quantum computers can speed up the solution of semidefinite programs, little is known about the quantum complexity of more general convex optimization. We present a quantum algorithm that can optimize a convex function over an $n$-dimensional convex body using $\tilde{O}(n)$ queries to oracles that evaluate the objective function and determine membership in the convex body. This represents a quadratic improvement over the best-known classical algorithm. We also study limitations on the power of quantum computers for general convex optimization, showing that it requires $\tilde{\Omega}(\sqrt{n})$ evaluation queries and $\Omega(\sqrt{n})$ membership queries.
This paper investigates accelerating the convergence of distributed optimization algorithms on non-convex problems. We propose a distributed primal-dual stochastic gradient descent (SGD) method equipped with the powerball method to accelerate convergence. We show that the proposed algorithm achieves the linear speedup convergence rate $\mathcal{O}(1/\sqrt{nT})$ for general smooth (possibly non-convex) cost functions. We demonstrate the efficiency of the algorithm through numerical experiments, training two-layer fully connected neural networks and convolutional neural networks on the MNIST dataset and comparing with state-of-the-art distributed SGD algorithms and centralized SGD algorithms.
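For context, the powerball transform referenced above is commonly defined as replacing each gradient coordinate $g_i$ with $\mathrm{sign}(g_i)\,|g_i|^{\gamma}$ for some $\gamma \in (0, 1]$. Below is a minimal single-node sketch of SGD with this transform, assuming a toy quadratic objective and illustrative hyperparameters; it is not the paper's distributed primal-dual algorithm.

```python
import numpy as np

def powerball(g, gamma=0.5):
    """Element-wise powerball transform: sign(g) * |g|^gamma."""
    return np.sign(g) * np.abs(g) ** gamma

def powerball_sgd(grad_fn, x0, lr=0.05, gamma=0.5, noise=0.01, iters=500, seed=0):
    """Plain SGD where each (noisy) gradient is passed through the powerball transform."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(iters):
        g = grad_fn(x) + noise * rng.normal(size=x.shape)  # stochastic gradient
        x -= lr * powerball(g, gamma)
    return x

# Toy example: an ill-conditioned quadratic with minimizer at the origin.
A = np.diag([1.0, 10.0])
grad_fn = lambda x: A @ x
print("approximate minimizer:", powerball_sgd(grad_fn, np.array([5.0, 5.0])))
```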
We propose an accelerated meta-algorithm, which allows one to obtain accelerated methods for convex unconstrained minimization in different settings. As an application of the general scheme, we propose nearly optimal methods for minimizing smooth functions with Lipschitz derivatives of an arbitrary order, as well as for smooth minimax optimization problems. The proposed meta-algorithm is more general than those in the literature and allows one to obtain better convergence rates and practical performance in several settings.
Hao Luo, 2021
This work introduces a second-order differential inclusion for unconstrained convex optimization. At the continuous level, existence of solutions in a proper sense is obtained, and exponential decay of a novel Lyapunov function along the solution trajectory is derived as well. Then, at the discrete level, based on numerical discretizations of the continuous differential inclusion, both an inexact accelerated proximal point algorithm and an inexact accelerated proximal gradient method are proposed, and some new convergence rates are established via a discrete Lyapunov function.
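For readers unfamiliar with this class of methods, the following is a standard accelerated proximal gradient (FISTA-style) iteration on an l1-regularized least-squares problem. It is shown only as background and is not the paper's specific discretization of the differential inclusion; the example problem and all parameter values are assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def accelerated_prox_grad(A, b, lam=0.1, iters=200):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with Nesterov-style momentum."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the smooth part
    x = y = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        grad = A.T @ (A @ y - b)
        x_new = soft_threshold(y - grad / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum (extrapolation) step
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 80))
b = rng.normal(size=40)
print("nonzeros in solution:", np.count_nonzero(np.abs(accelerated_prox_grad(A, b)) > 1e-8))
```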
We study the robustness of accelerated first-order algorithms to stochastic uncertainties in gradient evaluation. Specifically, for unconstrained, smooth, strongly convex optimization problems, we examine the mean-squared error in the optimization variable when the iterates are perturbed by additive white noise. This type of uncertainty may arise in situations where an approximation of the gradient is sought through measurements of a real system or in a distributed computation over a network. Even though the underlying dynamics of first-order algorithms for this class of problems are nonlinear, we establish upper bounds on the mean-squared deviation from the optimal solution that are tight up to constant factors. Our analysis quantifies fundamental trade-offs between noise amplification and convergence rates obtained via any acceleration scheme similar to Nesterov's or heavy-ball methods. To gain additional analytical insight, for strongly convex quadratic problems, we explicitly evaluate the steady-state variance of the optimization variable in terms of the eigenvalues of the Hessian of the objective function. We demonstrate that the entire spectrum of the Hessian, rather than just the extreme eigenvalues, influences the robustness of noisy algorithms. We specialize this result to the problem of distributed averaging over undirected networks and examine the role of network size and topology in the robustness of noisy accelerated algorithms.
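A small simulation in the spirit of this setting: heavy-ball iterations on a strongly convex quadratic whose gradient is perturbed by additive white noise, with the empirical mean-squared deviation from the optimum measured after the transient. The Hessian eigenvalues, noise level, and step/momentum parameters below are illustrative assumptions rather than the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
eigs = np.array([1.0, 10.0, 100.0])         # Hessian eigenvalues (condition number 100)
H = np.diag(eigs)
m, L = eigs.min(), eigs.max()

# Standard heavy-ball tuning for a quadratic objective.
alpha = 4.0 / (np.sqrt(L) + np.sqrt(m)) ** 2
beta = ((np.sqrt(L) - np.sqrt(m)) / (np.sqrt(L) + np.sqrt(m))) ** 2

sigma = 0.1                                 # white-noise standard deviation
x = x_prev = rng.normal(size=3)
errs = []
for k in range(20000):
    g = H @ x + sigma * rng.normal(size=3)  # noisy gradient (the optimum is x* = 0)
    x, x_prev = x - alpha * g + beta * (x - x_prev), x
    if k > 5000:                            # discard the transient before averaging
        errs.append(np.dot(x, x))
print("steady-state mean-squared deviation:", np.mean(errs))
```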
