On the Convergence of Asynchronous Parallel Iteration with Unbounded Delays

Added by Yangyang Xu

Publication date: 2016

Field: Informatics Engineering

Language: English

Authors: Zhimin Peng, Yangyang Xu, Ming Yan

Recent years have witnessed a surge of asynchronous parallel (async-parallel) iterative algorithms, driven by problems involving very large-scale data and a large number of decision variables. Because of asynchrony, the iterates are computed with outdated information, and the age of the outdated information, which we call delay, is the number of times it has been updated since its creation. Almost all recent works prove convergence under the assumption of a finite maximum delay and set their stepsize parameters accordingly. However, the maximum delay is practically unknown. This paper presents a convergence analysis of an async-parallel method from a probabilistic viewpoint, and it allows for large unbounded delays. An explicit stepsize formula that guarantees convergence is given in terms of delay statistics. With $p+1$ identical processors, we empirically measured that delays closely follow the Poisson distribution with parameter $p$, matching our theoretical model, and thus the stepsize can be set accordingly. Simulations on both convex and nonconvex optimization problems demonstrate the validity of our analysis and also show that the existing maximum-delay-induced stepsize is too conservative, often slowing down the convergence of the algorithm.
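The delay model described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's algorithm or its stepsize formula: it runs a sequential emulation of async-parallel coordinate updates on a separable quadratic, where each update reads an iterate that is stale by a Poisson-distributed number of updates. The function names, the test problem, and the stepsize value are all assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def objective(x, curvatures):
    """Separable quadratic f(x) = 0.5 * sum_i a_i * x_i^2 (illustrative problem)."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(curvatures, x))


def async_coordinate_descent(curvatures, x0, stepsize, mean_delay, num_updates, seed=0):
    """Emulate async-parallel coordinate updates with Poisson-distributed delays.

    Update k picks a random coordinate i and applies a gradient step whose
    partial derivative is evaluated at the iterate from `delay` updates ago.
    The stepsize is a hand-picked constant, NOT the formula from the paper.
    """
    rng = random.Random(seed)
    history = [list(x0)]  # history[k] is the iterate after k updates
    for k in range(num_updates):
        # Delays are unbounded in distribution, but cannot reach before x0.
        delay = min(sample_poisson(mean_delay, rng), k)
        stale = history[k - delay]
        i = rng.randrange(len(x0))
        x_next = list(history[k])
        x_next[i] -= stepsize * curvatures[i] * stale[i]  # stale partial gradient
        history.append(x_next)
    return history[-1]


if __name__ == "__main__":
    curvatures = [1.0 + 0.1 * j for j in range(10)]
    x0 = [1.0] * 10
    # mean_delay=4.0 mimics p = 4, i.e. p + 1 = 5 identical processors.
    x_final = async_coordinate_descent(curvatures, x0, 0.1, 4.0, 3000, seed=1)
    print("initial objective:", objective(x0, curvatures))
    print("final objective:  ", objective(x_final, curvatures))
```

Varying `stepsize` and `mean_delay` in this sketch gives a rough feel for how stale reads interact with the stepsize. The truncation `min(..., k)` only reflects that no iterate exists before the starting point.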

Asynchronous Distributed Optimization with Stochastic Delays

Margalit Glasgow, Mary Wootters (2020)

We study asynchronous finite-sum minimization in a distributed-data setting with a central parameter server. While asynchrony is well understood in parallel settings where the data is accessible by all machines -- e.g., modifications of variance-reduced gradient algorithms like SAGA work well -- little is known for the distributed-data setting. We develop an algorithm ADSAGA based on SAGA for the distributed-data setting, in which the data is partitioned between many machines. We show that with $m$ machines, under a natural stochastic delay model with a mean delay of $m$, ADSAGA converges in $\tilde{O}\left(\left(n + \sqrt{m}\kappa\right)\log(1/\epsilon)\right)$ iterations, where $n$ is the number of component functions, and $\kappa$ is a condition number. This complexity sits squarely between the complexity $\tilde{O}\left(\left(n + \kappa\right)\log(1/\epsilon)\right)$ of SAGA without delays and the complexity $\tilde{O}\left(\left(n + m\kappa\right)\log(1/\epsilon)\right)$ of parallel asynchronous algorithms where the delays are arbitrary (but bounded by $O(m)$), and the data is accessible by all. Existing asynchronous algorithms for the distributed-data setting with arbitrary delays have only been shown to converge in $\tilde{O}(n^2\kappa\log(1/\epsilon))$ iterations. We empirically compare on least-squares problems the iteration complexity and wallclock performance of ADSAGA to existing parallel and distributed algorithms, including synchronous minibatch algorithms. Our results demonstrate the wallclock advantage of variance-reduced asynchronous approaches over SGD or synchronous approaches.

Machine Learning; Distributed, Parallel, and Cluster Computing

Parallel and distributed asynchronous adaptive stochastic gradient methods

Yangyang Xu, Yibo Xu, Yonggui Yan (2020)

Stochastic gradient methods (SGMs) are the predominant approaches to train deep learning models. The adapti

Optimization and Control; Distributed, Parallel, and Cluster Computing; Numerical Analysis

On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging

Chris Junchi Li, Yaodong Yu, Nicolas Loizou (2021)

We study the stochastic bilinear minimax optimization problem, presenting an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence. We first note that the last iterate of the basic SEG method only contracts to a fixed neighborhood of the Nash equilibrium, independent of the step size. This contrasts sharply with the standard setting of minimization where standard stochastic algorithms converge to a neighborhood that vanishes in proportion to the square-root (constant) step size. Under the same setting, however, we prove that when augmented with iteration averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure. In the interpolation setting, we achieve an optimal convergence rate up to tight constants. We present numerical experiments that validate our theoretical findings and demonstrate the effectiveness of the SEG method when equipped with iteration averaging and restarting.

Optimization and Control; Computer Science and Game Theory; Machine Learning

Distributed Picard Iteration

Francisco L. Andrade, Mario A. T. Figueiredo, João Xavier (2021)

The Picard iteration is widely used to find fixed points of locally contractive (LC) maps. This paper extends the Picard iteration to distributed settings; specifically, we assume the map of which the fixed point is sought to be the average of individual (not necessarily LC) maps held by a set of agents linked by a sparse communication network. An additional difficulty is that the LC map is not assumed to come from an underlying optimization problem, which prevents exploiting strong global properties such as convexity or Lipschitzianity. Yet, we propose a distributed algorithm and prove its convergence, in fact showing that it maintains the linear rate of the standard Picard iteration for the average LC map. As another contribution, our proof imports tools from perturbation theory of linear operators, which, to the best of our knowledge, had not been used before in the theory of distributed computation.
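For orientation, the classical (centralized) Picard iteration that this abstract builds on can be sketched as follows. This is not the distributed algorithm of the paper: the average map is evaluated directly, with no communication network, and the scalar agent maps, tolerance, and function names are assumptions chosen only to make the example runnable.

```python
import math


def average_map(agent_maps, x):
    """Average of the individual agent maps, evaluated at x."""
    return sum(T(x) for T in agent_maps) / len(agent_maps)


def picard_iteration(agent_maps, x0, tol=1e-12, max_iter=1000):
    """Iterate x <- T(x), with T the average map, until successive iterates agree."""
    x = x0
    for _ in range(max_iter):
        x_next = average_map(agent_maps, x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


# Hypothetical agent maps: each is 0.5 * cos(x) plus an agent-specific offset,
# so the average map is a contraction with factor at most 0.5.
agent_maps = [lambda x, c=c: 0.5 * math.cos(x) + c for c in (0.0, 0.2, 0.4)]

if __name__ == "__main__":
    fixed_point = picard_iteration(agent_maps, 0.0)
    print("fixed point:", fixed_point)
    print("residual:   ", abs(average_map(agent_maps, fixed_point) - fixed_point))
```

Because the average map here contracts by a factor of at most 0.5, the distance to the fixed point at least halves with each step, which is the linear rate the abstract refers to.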

Optimization and Control; Distributed, Parallel, and Cluster Computing

Policy iteration for Hamilton-Jacobi-Bellman equations with control constraints

Sudeep Kundu, Karl Kunisch (2020)

Policy iteration is a widely used technique to solve the Hamilton-Jacobi-Bellman (HJB) equation, which arises from nonlinear optimal feedback control theory. Its convergence analysis has attracted much attention in the unconstrained case. Here we analyze the case with control constraints for the HJB equations that arise in both deterministic and stochastic control. The linear equations in each iteration step are solved by an implicit upwind scheme. Numerical examples solve the HJB equation with control constraints, and comparisons are shown with the unconstrained cases.

Optimization and Control; Numerical Analysis
