
Information compression is essential to reduce communication cost in distributed optimization over peer-to-peer networks. This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimization problems. By compressing innovation vectors, which are the differences between decision vectors and their estimates, COLD achieves linear convergence for a class of $\delta$-contracted compressors. We explicitly quantify how the compression affects the convergence rate and show that COLD matches the rate of its uncompressed version. To accommodate a wider class of compressors that includes the binary quantizer, we further design a novel dynamical scaling mechanism and obtain the linearly convergent Dyna-COLD. Importantly, our results strictly improve existing results for the quantized consensus problem. Numerical experiments demonstrate the advantages of both algorithms under different compressors.
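For concreteness, a Top-k compressor is one standard example of a $\delta$-contracted compressor (with $\delta = k/d$ for $d$-dimensional vectors). The Python sketch below, with variable names of our own choosing, illustrates how a node could compress and transmit only the innovation vector while both ends keep their estimates synchronized; it shows the general idea rather than the exact COLD update.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero the rest; a standard
    delta-contracted compressor with delta = k / len(v)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compress_innovation(x_i, x_i_hat, k=2):
    """One illustrative communication round: compress the innovation
    (decision vector minus its current estimate), send only the compressed
    vector, and let sender and receiver update the estimate identically."""
    q_i = top_k(x_i - x_i_hat, k)   # only q_i travels over the network
    x_i_hat_new = x_i_hat + q_i     # estimate update applied on both sides
    return q_i, x_i_hat_new

rng = np.random.default_rng(0)
x_i, x_i_hat = rng.normal(size=5), np.zeros(5)
q_i, x_i_hat = compress_innovation(x_i, x_i_hat)
```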
This paper considers a networked aggregative game (NAG) in which the players are distributed over a communication network. By communicating only with a subset of players, the goal of each player in the NAG is to minimize an individual cost function that depends on its own action and the aggregate of all players' actions. To this end, we design a novel distributed algorithm that jointly exploits the ideas of the consensus algorithm and conditional projection descent. Under a strongly monotone assumption on the pseudo-gradient mapping, the proposed algorithm with fixed step-sizes is proved to converge linearly to the unique Nash equilibrium of the NAG. The theoretical results are then validated by numerical experiments.
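As a rough illustration of how consensus and projection can be combined in a per-player update, the sketch below uses our own notation and a simple box-shaped action set; the actual update rule and weighting in the paper may differ.

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto a box action set (illustrative choice)."""
    return np.clip(x, lo, hi)

def player_update(x_i, v_i, neighbor_vs, weights, grad_i, alpha, lo, hi):
    """One hypothetical iteration for player i (all names are ours).

    x_i          : player i's current action
    v_i          : player i's estimate of the average action
    neighbor_vs  : estimates received from neighbors (including v_i itself)
    weights      : corresponding consensus weights
    grad_i       : callable (x_i, aggregate_estimate) -> partial gradient
    """
    v_mix = sum(w * v for w, v in zip(weights, neighbor_vs))        # consensus step
    x_new = project_box(x_i - alpha * grad_i(x_i, v_mix), lo, hi)   # projected descent step
    v_new = v_mix + (x_new - x_i)   # track the average action as actions change
    return x_new, v_new
```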
While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest without explicit dynamical models and is an essential approach for reinforcement learning problems. However, it usually leads to a non-convex optimization problem, for which there is little theoretical understanding of its performance. In this paper, we focus on the risk-constrained Linear Quadratic Regulator (LQR) problem with noisy input via the PO approach, which results in a challenging non-convex problem. To this end, we first build on our earlier result that the optimal policy has an affine structure to show that the associated Lagrangian function is locally gradient dominated with respect to the policy, based on which we establish strong duality. Then, we design policy gradient primal-dual methods with global convergence guarantees to find an optimal policy-multiplier pair in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our policy gradient primal-dual methods.
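At a high level, a policy gradient primal-dual method alternates a descent step on the policy with a projected ascent step on the multiplier. The sketch below is a generic template under our own naming; the concrete gradient and constraint oracles for the risk-constrained LQR are problem-specific and not reproduced here.

```python
import numpy as np

def primal_dual(theta0, lam0, grad_L_theta, constraint_gap,
                eta=1e-2, rho=1e-2, iters=500):
    """Generic primal-dual iteration on a Lagrangian L(theta, lam).

    grad_L_theta(theta, lam): gradient of the Lagrangian in the policy parameters
    constraint_gap(theta):    risk-constraint value minus the allowed budget
    """
    theta, lam = theta0.copy(), lam0
    for _ in range(iters):
        theta = theta - eta * grad_L_theta(theta, lam)      # primal descent on the policy
        lam = max(0.0, lam + rho * constraint_gap(theta))   # dual ascent, projected to lam >= 0
    return theta, lam
```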
Communication efficiency is a major bottleneck in the applications of distributed networks. To address this problem, quantized distributed optimization has attracted a lot of attention. However, most existing quantized distributed optimization algorithms can only converge sublinearly. To achieve linear convergence, this paper proposes a novel quantized distributed gradient tracking algorithm (Q-DGT) to minimize a finite sum of local objective functions over directed networks. Moreover, we explicitly derive the update rule for the number of quantization levels, and prove that Q-DGT can converge linearly even when each exchanged variable is represented by a single bit. Numerical results also confirm the efficiency of the proposed algorithm.
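To make the one-bit regime concrete, the sketch below implements a simple uniform quantizer whose resolution is set by a number of levels; with two levels each entry reduces to a single bit. This is only an illustrative quantizer; the exact quantizer and the level-update rule used in Q-DGT follow the paper's analysis.

```python
import numpy as np

def uniform_quantize(v, scale, num_levels):
    """Uniform quantizer over [-scale, scale] with num_levels levels per entry
    (an illustrative quantizer, not the paper's exact construction)."""
    step = 2.0 * scale / (num_levels - 1)
    clipped = np.clip(v, -scale, scale)
    return np.round((clipped + scale) / step) * step - scale

# With num_levels = 2 each entry carries one bit of information (its sign
# relative to the midpoint), matching the one-bit regime mentioned above.
x = np.array([0.7, -0.2, 1.3])
print(uniform_quantize(x, scale=1.0, num_levels=2))   # [ 1. -1.  1.]
```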
The behaviour of a stochastic dynamical system may be largely influenced by low-probability yet extreme events. To address such occurrences, this paper proposes an infinite-horizon risk-constrained Linear Quadratic Regulator (LQR) framework with time-average cost. In addition to the standard LQR objective, the average one-stage predictive variance of the state penalty is constrained to lie within a user-specified level. By leveraging duality, its optimal solution is first shown to be stationary and affine in the state, i.e., $u(x,\lambda^*) = -K(\lambda^*)x + l(\lambda^*)$, where $\lambda^*$ is an optimal multiplier that addresses the risk constraint. Then, we establish the stability of the resulting closed-loop system. Furthermore, we propose a primal-dual method with a sublinear convergence rate to find an optimal policy $u(x,\lambda^*)$. Finally, a numerical example is provided to demonstrate the effectiveness of the proposed framework and the primal-dual method.
We propose a fully asynchronous networked aggregative game (Asy-NAG) where each player minimizes a cost function that depends on its local action and the aggregate of all players' actions. In sharp contrast to existing NAGs, each player in our Asy-NAG can compute an estimate of the aggregate action at any wall-clock time by using only (possibly stale) information from nearby players in a directed network. Such an asynchronous update does not require any coordination among players. Moreover, we design a novel distributed algorithm with an aggressive mechanism for each player to adaptively adjust its optimization stepsize per update. In particular, players that are slow in updating their estimates smartly increase their stepsizes to catch up with the fast ones. Then, we develop an augmented system approach to address the asynchronicity and the information delays between players, and rigorously show convergence to a Nash equilibrium of the Asy-NAG via a perturbed coordinate algorithm, which is also of independent interest. Finally, we evaluate the performance of the distributed algorithm through numerical simulations.
Feiran Zhao, Keyou You (2020)
Risk-aware control, though promising for tackling unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller, with a focus on linear systems. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. To solve it, we parameterize the policy with a feedback gain pair and leverage primal-dual methods to optimize it using data alone. We first study the optimization landscape of the Lagrangian function and establish strong duality in spite of its non-convexity. Alongside, we find that the Lagrangian function enjoys an important local gradient dominance property, which we exploit to develop a convergent random search algorithm to learn the dual function. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.
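The random search component can be illustrated with a standard two-point zeroth-order gradient estimate that uses only cost evaluations; the sketch below is a generic template (function and parameter names are ours), not the exact algorithm in the paper.

```python
import numpy as np

def random_search_step(K, cost, step=1e-2, radius=1e-1, num_dirs=10, rng=None):
    """One zeroth-order (random search) update of a feedback gain K using only
    cost evaluations. `cost(K)` is assumed to return an estimate of the
    Lagrangian value from sampled trajectories; its construction is
    problem-specific and not shown here."""
    rng = rng or np.random.default_rng()
    grad_est = np.zeros_like(K)
    for _ in range(num_dirs):
        U = rng.normal(size=K.shape)
        U /= np.linalg.norm(U)                  # random unit direction
        # two-point estimate of the directional derivative along U
        grad_est += (cost(K + radius * U) - cost(K - radius * U)) / (2 * radius) * U
    grad_est /= num_dirs
    return K - step * grad_est                  # gradient descent on the policy
```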
Feiran Zhao, Keyou You (2020)
Optimal control of a stochastic dynamical system usually requires a good dynamical model with probability distributions, which is difficult to obtain due to limited measurements and/or complicated dynamics. To address this, this work proposes a data-driven distributionally robust control framework with the Wasserstein metric via a constrained two-player zero-sum Markov game, where the adversarial player selects the probability distribution from a Wasserstein ball centered at an empirical distribution. The game is then approached via its penalized version, an optimal stabilizing solution of which is derived explicitly in a linear structure under Riccati-type iterations. Moreover, we design a model-free Q-learning algorithm with global convergence to learn the optimal controller. Finally, we verify the effectiveness of the proposed learning algorithm and demonstrate its robustness to probability distribution errors via numerical examples.
Fei Dong, Keyou You (2020)
The isoline tracking in this work is concerned with the control design for a sensing vehicle to track a desired isoline of an unknown scalar field. To this end, we propose a simple PI-like controller for a Dubins vehicle in GPS-denied environments. Our key idea lies in the design of a novel sliding-surface-based error in the standard PI controller. For a circular field, we show that the P-like controller can globally regulate the vehicle to the desired isoline with a steady-state error that can be arbitrarily reduced by increasing the P gain, and that is eliminated by the PI-like controller. For any smooth field, the P-like controller is able to achieve local regulation. Then, the design is extended to the cases of a single-integrator vehicle and a double-integrator vehicle, respectively. Finally, the effectiveness and advantages of our approaches are validated via simulations on fixed-wing UAV and quadrotor simulators.
Fei Dong, Keyou You (2020)
The isoline tracking in this work is concerned with the control design for a sensing robot to track a given isoline of an unknown 2-D scalar field. To this end, we propose a coordinate-free controller with a simple PI-like form that uses only concentration feedback for a Dubins robot, which is particularly useful in GPS-denied environments. The key idea lies in the novel design of a sliding-surface-based error term in the standard PI controller. Interestingly, we also prove that the tracking error can be reduced by increasing the proportional gain, and is eliminated for circular fields with a non-zero integral gain. The effectiveness of our controller is validated via simulations using a fixed-wing UAV on a real dataset of the concentration distribution of PM 2.5 in Handan, China.
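To illustrate the flavor of such a concentration-feedback PI-like controller, the sketch below combines the concentration error with its rate in a sliding-surface-style term and feeds the result through a PI law to produce a turn-rate command; the gains, error definition, and discretization are our own illustrative choices rather than the paper's exact design.

```python
class PILikeIsolineController:
    """Illustrative PI-like controller driven only by concentration feedback."""

    def __init__(self, c_ref, kp=1.0, ki=0.1, ks=0.5, dt=0.1):
        self.c_ref, self.kp, self.ki, self.ks, self.dt = c_ref, kp, ki, ks, dt
        self.prev_c = None
        self.integral = 0.0

    def turn_rate(self, c):
        """Map the measured concentration c to a turn-rate command."""
        c_dot = 0.0 if self.prev_c is None else (c - self.prev_c) / self.dt
        self.prev_c = c
        e = (c - self.c_ref) + self.ks * c_dot        # sliding-surface-style error
        self.integral += e * self.dt
        return self.kp * e + self.ki * self.integral  # PI-like turn-rate command
```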
