ترغب بنشر مسار تعليمي؟ اضغط هنا

Towards an $O(frac{1}{t})$ convergence rate for distributed dual averaging

358   0   0.0 ( 0 )
 نشر من قبل Changxin Liu
 تاريخ النشر 2019
  مجال البحث
والبحث باللغة English




اسأل ChatGPT حول البحث

Recently, distributed dual averaging has received increasing attention due to its superiority in handling constraints and dynamic networks in multiagent optimization. However, all distributed dual averaging methods reported so far considered nonsmooth problems and have a convergence rate of $O(frac{1}{sqrt{t}})$. To achieve an improved convergence guarantee for smooth problems, this work proposes a second-order consensus scheme that assists each agent to locally track the global dual variable more accurately. This new scheme in conjunction with smoothness of the objective ensures that the accumulation of consensus error over time caused by incomplete global information is bounded from above. Then, a rigorous investigation of dual averaging with inexact gradient oracles is carried out to compensate the consensus error and achieve an $O(frac{1}{t})$ convergence rate. The proposed method is examined in a large-scale LASSO problem.



قيم البحث

اقرأ أيضاً

A collection of optimization problems central to power system operation requires distributed solution architectures to avoid the need for aggregation of all information at a central location. In this paper, we study distributed dual subgradient metho ds to solve three such optimization problems. Namely, these are tie-line scheduling in multi-area power systems, coordination of distributed energy resources in radial distribution networks, and joint dispatch of transmission and distribution assets. With suitable relaxations or approximations of the power flow equations, all three problems can be reduced to a multi-agent constrained convex optimization problem. We utilize a constant step-size dual subgradient method with averaging on these problems. For this algorithm, we provide a convergence guarantee that is shown to be order-optimal. We illustrate its application on the grid optimization problems.
Considering the constrained stochastic optimization problem over a time-varying random network, where the agents are to collectively minimize a sum of objective functions subject to a common constraint set, we investigate asymptotic properties of a d istributed algorithm based on dual averaging of gradients. Different from most existing works on distributed dual averaging algorithms that mainly concentrating on their non-asymptotic properties, we not only prove almost sure convergence and the rate of almost sure convergence, but also asymptotic normality and asymptotic efficiency of the algorithm. Firstly, for general constrained convex optimization problem distributed over a random network, we prove that almost sure consensus can be archived and the estimates of agents converge to the same optimal point. For the case of linear constrained convex optimization, we show that the mirror map of the averaged dual sequence identifies the active constraints of the optimal solution with probability 1, which helps us to prove the almost sure convergence rate and then establish asymptotic normality of the algorithm. Furthermore, we also verify that the algorithm is asymptotically optimal. To the best of our knowledge, it seems to be the first asymptotic normality result for constrained distributed optimization algorithms. Finally, a numerical example is provided to justify the theoretical analysis.
128 - Ermin Wei , Asuman Ozdaglar 2013
We consider a network of agents that are cooperatively solving a global optimization problem, where the objective function is the sum of privately known local objective functions of the agents and the decision variables are coupled via linear constra ints. Recent literature focused on special cases of this formulation and studied their distributed solution through either subgradient based methods with O(1/sqrt(k)) rate of convergence (where k is the iteration number) or Alternating Direction Method of Multipliers (ADMM) based methods, which require a synchronous implementation and a globally known order on the agents. In this paper, we present a novel asynchronous ADMM based distributed method for the general formulation and show that it converges at the rate O(1/k).
In recent years, variance-reducing stochastic methods have shown great practical performance, exhibiting linear convergence rate when other stochastic methods offered a sub-linear rate. However, as datasets grow ever bigger and clusters become widesp read, the need for fast distribution methods is pressing. We propose here a distribution scheme for SAGA which maintains a linear convergence rate, even when communication between nodes is limited.
This paper studies decentralized convex optimization problems defined over networks, where the objective is to minimize a sum of local smooth convex functions while respecting a common constraint. Two new algorithms based on dual averaging and decent ralized consensus-seeking are proposed. The first one accelerates the standard convergence rate $O(frac{1}{sqrt{t}})$ in existing decentralized dual averaging (DDA) algorithms to $O(frac{1}{t})$, where $t$ is the time counter. This is made possible by a second-order consensus scheme that assists each agent to locally track the global dual variable more accurately and a new analysis of the descent property for the mean variable. We remark that, in contrast to its primal counterparts, this method decouples the synchronization step from nonlinear projection, leading to a rather concise analysis and a natural extension to stochastic networks. In the second one, two local sequences of primal variables are constructed in a decentralized manner to achieve acceleration, where only one of them is exchanged between agents. In addition to this, another consensus round is performed for local dual variables. The convergence rate is proved to be $O(1)(frac{1}{t^2}+frac{1}{t})$, where the magnitude of error bound is showed to be inversely proportional to the algebraic connectivity of the graph. However, the condition for stepsize does not rely on the weight matrix associated with the graph, making it easier to satisfy in practice than other accelerated methods. Finally, comparisons between the proposed methods and several recent algorithms are performed using a large-scale LASSO problem.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا