
Linear Quadratic Games with Costly Measurements

Published by Dipankar Maity
Publication date: 2017
Research field: Informatics Engineering
Research language: English





In this work we consider a stochastic linear quadratic two-player game. The state measurements are observed through a switched noiseless communication link. Each player incurs a finite cost every time the link is established to obtain measurements. Along with the usual control action, each player is equipped with a switching action to control the communication link. The measurements improve the state estimate and hence reduce the quadratic cost, but each use of the link adds a switching cost. We study the subgame perfect equilibrium control and switching strategies for the players. We show that the problem can be solved in a two-step process by solving two dynamic programming problems. The first step solves a dynamic program for the control strategy, and the second step solves another dynamic program for the switching strategy.
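The trade-off between estimation error and switching cost can be illustrated with a small dynamic program. The sketch below is not the paper's game: it assumes a single decision maker, a scalar system with hypothetical parameters `a` (dynamics), `w` (process noise variance), `gamma` (weight on the estimation error variance), and `c` (cost per measurement), and it assumes that a noiseless measurement resets the error variance to zero. Under those assumptions the error variance depends only on the number of steps since the last measurement, so the switching schedule can be found by backward recursion over that count.

```python
from functools import lru_cache


def switching_schedule(a, w, gamma, c, horizon):
    """Return (optimal cost, list of 0/1 measurement decisions).

    Illustrative single-agent, scalar model (an assumption, not the paper's
    two-player formulation).
    """

    def err_var(tau):
        # Error variance after tau steps without a measurement:
        # P(0) = 0, P(t) = a^2 * P(t-1) + w.
        p = 0.0
        for _ in range(tau):
            p = a * a * p + w
        return p

    @lru_cache(maxsize=None)
    def value(k, tau):
        # k: current time step; tau: steps since the last measurement.
        if k == horizon:
            return 0.0, ()
        # Option 1: pay c, measure now, error variance at this step is zero.
        future_cost, future_plan = value(k + 1, 1)
        measure_total = c + future_cost
        # Option 2: skip the measurement and pay for the accumulated error.
        skip_future, skip_plan = value(k + 1, tau + 1)
        skip_total = gamma * err_var(tau) + skip_future
        if measure_total <= skip_total:
            return measure_total, (1,) + future_plan
        return skip_total, (0,) + skip_plan

    cost, plan = value(0, 1)
    return cost, list(plan)


if __name__ == "__main__":
    print(switching_schedule(a=1.0, w=1.0, gamma=1.0, c=2.5, horizon=6))
```

With a free link (`c = 0`) the schedule measures at every step, and with a very expensive link it never measures; intermediate costs give periodic measurement patterns.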




Read also

We propose a new risk-constrained reformulation of the standard Linear Quadratic Regulator (LQR) problem. Our framework is motivated by the fact that the classical (risk-neutral) LQR controller, although optimal in expectation, might be ineffective under relatively infrequent, yet statistically significant (risky) events. To effectively trade between average and extreme event performance, we introduce a new risk constraint, which explicitly restricts the total expected predictive variance of the state penalty by a user-prescribed level. We show that, under rather minimal conditions on the process noise (i.e., finite fourth-order moments), the optimal risk-aware controller can be evaluated explicitly and in closed form. In fact, it is affine relative to the state, and is always internally stable regardless of parameter tuning. Our new risk-aware controller: i) pushes the state away from directions where the noise exhibits heavy tails, by exploiting the third-order moment (skewness) of the noise; ii) inflates the state penalty in riskier directions, where both the noise covariance and the state penalty are simultaneously large. The properties of the proposed risk-aware LQR framework are also illustrated via indicative numerical examples.
We study the problem of learning-augmented predictive linear quadratic control. Our goal is to design a controller that balances consistency, which measures the competitive ratio when predictions are accurate, and robustness, which bounds the competitive ratio when predictions are inaccurate. We propose a novel $\lambda$-confident controller and prove that it maintains a competitive ratio upper bound of $1+\min\{O(\lambda^2\varepsilon)+ O(1-\lambda)^2,\, O(1)+O(\lambda^2)\}$ where $\lambda\in [0,1]$ is a trust parameter set based on the confidence in the predictions, and $\varepsilon$ is the prediction error. Further, we design a self-tuning policy that adaptively learns the trust parameter $\lambda$ with a regret that depends on $\varepsilon$ and the variation of perturbations and predictions.
We motivate and propose a new model for non-cooperative Markov games which considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic risk from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed and the existence of such equilibria is demonstrated in stationary strategies by an application of Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures which can naturally be written as saddle-point stochastic optimization problems, and covers many widely investigated risk measures. Finally, the almost sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under some mild conditions. Our numerical experiments on a two-player queuing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real life competitive decision-making.
Controlling network systems has become a problem of paramount importance. Optimally controlling a network system with linear dynamics and minimizing a quadratic cost is a particular case of the well-studied linear-quadratic problem. When the specific topology of the network system is ignored, the optimal controller is readily available. However, this results in a centralized controller, facing limitations in terms of implementation and scalability. Finding the optimal distributed controller, on the other hand, is intractable in the general case. In this paper, we propose the use of graph neural networks (GNNs) to parametrize and design a distributed controller. GNNs exhibit many desirable properties, such as being naturally distributed and scalable. We cast the distributed linear-quadratic problem as a self-supervised learning problem, which is then used to train the GNN-based controllers. We also obtain sufficient conditions for the resulting closed-loop system to be input-state stable, and derive an upper bound on the trajectory deviation when the system is not accurately known. We run extensive simulations to study the performance of GNN-based distributed controllers and show that they are computationally efficient and scalable.
The behaviour of a stochastic dynamical system may be largely influenced by low-probability, yet extreme events. To address such occurrences, this paper proposes an infinite-horizon risk-constrained Linear Quadratic Regulator (LQR) framework with time-average cost. In addition to the standard LQR objective, the average one-stage predictive variance of the state penalty is constrained to lie within a user-specified level. By leveraging duality, its optimal solution is first shown to be stationary and affine in the state, i.e., $u(x,\lambda^*) = -K(\lambda^*)x + l(\lambda^*)$, where $\lambda^*$ is an optimal multiplier, used to address the risk constraint. Then, we establish the stability of the resulting closed-loop system. Furthermore, we propose a primal-dual method with sublinear convergence rate to find an optimal policy $u(x,\lambda^*)$. Finally, a numerical example is provided to demonstrate the effectiveness of the proposed framework and the primal-dual method.