On the Regret Analysis of Online LQR Control with Predictions

Published by Ms. Runyu Zhang

Publication date: 2021

Authors: Runyu Zhang, Yingying Li, Na Li

Research field: Optimization and Control

Language: English

In this paper, we study the dynamic regret of online linear quadratic regulator (LQR) control with time-varying cost functions and disturbances. We consider the case where a finite look-ahead window of cost functions and disturbances is available at each stage. The online control algorithm studied in this paper falls into the category of model predictive control (MPC) with a particular choice of terminal costs that ensures the exponential stability of MPC. We prove that the regret of this online algorithm decays exponentially fast with the length of the prediction window. We also investigate the impact of inaccurate disturbance predictions.
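The setting in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: a scalar system `x_{t+1} = a x_t + b u_t + w_t`, time-invariant cost weights `q` and `r`, exact disturbance predictions, and the stationary Riccati solution used as the terminal cost (the paper's actual terminal-cost choice is not stated here). All function names are hypothetical. The script compares MPC with a look-ahead window of `k` steps against the offline optimum and reports the dynamic regret for a few values of `k`.

```python
import numpy as np

def riccati_scalar(a, b, q, r, iters=500):
    """Fixed-point iteration for the scalar discrete-time Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p

def plan(x0, w, a, b, q, r, p_term):
    """Optimal inputs for a window with known disturbances w.

    Minimizes sum_{s=1}^{n-1} q x_s^2 + p_term x_n^2 + sum_{s=0}^{n-1} r u_s^2
    subject to x_{s+1} = a x_s + b u_s + w_s, as a linear least-squares problem.
    """
    n = len(w)
    # Row s of G maps the per-step inputs (b u_j + w_j) to the state x_{s+1}.
    G = np.zeros((n, n))
    for s in range(n):
        for j in range(s + 1):
            G[s, j] = a ** (s - j)
    free = np.array([a ** (s + 1) for s in range(n)]) * x0 + G @ w
    weights = np.sqrt(np.array([q] * (n - 1) + [p_term]))
    A = np.vstack([weights[:, None] * (b * G), np.sqrt(r) * np.eye(n)])
    c = np.concatenate([weights * free, np.zeros(n)])
    u, *_ = np.linalg.lstsq(A, -c, rcond=None)
    return u

def rollout_cost(u_seq, w, x0, a, b, q, r, p_term):
    """Total cost of applying u_seq, including the terminal cost on x_T."""
    x, cost = x0, 0.0
    for u, wt in zip(u_seq, w):
        cost += q * x * x + r * u * u
        x = a * x + b * u + wt
    return cost + p_term * x * x

def run_mpc(k, w, x0, a, b, q, r, p_term):
    """Receding-horizon control that sees only the next k disturbances."""
    T = len(w)
    x, inputs = x0, []
    for t in range(T):
        window = w[t:min(t + k, T)]      # look-ahead window, truncated at T
        u = plan(x, window, a, b, q, r, p_term)[0]
        inputs.append(u)
        x = a * x + b * u + w[t]
    return rollout_cost(inputs, w, x0, a, b, q, r, p_term)

# Hypothetical problem data, chosen only for illustration.
a, b, q, r, x0 = 1.0, 1.0, 1.0, 1.0, 0.0
w = np.random.default_rng(0).uniform(-1.0, 1.0, size=40)
p_term = riccati_scalar(a, b, q, r)

best = rollout_cost(plan(x0, w, a, b, q, r, p_term), w, x0, a, b, q, r, p_term)
regrets = {k: run_mpc(k, w, x0, a, b, q, r, p_term) - best for k in (1, 2, 4, 8)}
for k, reg in regrets.items():
    print(f"k = {k}: dynamic regret = {reg:.3e}")
```

With the Riccati terminal cost, each MPC step in this sketch is equivalent to planning as if all disturbances beyond the window were zero, which is why the gap to the offline optimum shrinks quickly as `k` grows.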

Chenkai Yu, Guanya Shi, Soon-Jo Chung, 2020

We study the impact of predictions in online Linear Quadratic Regulator control with both stochastic and adversarial disturbances in the dynamics. In both settings, we characterize the optimal policy and derive tight bounds on the minimum cost and dynamic regret. Perhaps surprisingly, our analysis shows that the conventional greedy MPC approach is a near-optimal policy in both stochastic and adversarial settings. Specifically, for length-$T$ problems, MPC requires only $O(\log T)$ predictions to reach $O(1)$ dynamic regret, which matches (up to lower-order terms) our lower bound on the required prediction horizon for constant regret.

Optimization and Control; Systems and Control

On Distributed Online Convex Optimization with Sublinear Dynamic Regret and Fit

Pranay Sharma, Prashant Khanduri, Lixin Shen, 2020

In this work, we consider a distributed online convex optimization problem with time-varying (potentially adversarial) constraints. A set of nodes jointly aims to minimize a global objective function, which is the sum of local convex functions. The objective and constraint functions are revealed locally to the nodes at each time step, after an action is taken. Naturally, the constraints cannot be satisfied instantaneously. Therefore, we reformulate the problem to satisfy these constraints in the long term. To this end, we propose a distributed primal-dual mirror descent based approach, in which the primal and dual updates are carried out locally at all the nodes. This is followed by sharing and mixing of the primal variables by the local nodes via communication with their immediate neighbors. To quantify the performance of the proposed algorithm, we use the challenging but more realistic metrics of dynamic regret and fit. Dynamic regret measures the cumulative loss incurred by the algorithm compared to the best dynamic strategy. Fit, on the other hand, measures the long-term cumulative constraint violations. Without assuming the restrictive Slater's condition, we show that the proposed algorithm achieves sublinear regret and fit under mild, commonly used assumptions.
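The two metrics named in the abstract above can be written out as follows. This is one common formalization, not necessarily the exact definition used by the authors; the symbols $f_t$ (global objective at time $t$), $g_t$ (constraint function), $x_t$ (the algorithm's action), and $\mathcal{X}$ (the feasible set) are assumptions introduced here for illustration.

```latex
% Dynamic regret: cumulative loss relative to the best per-step feasible action.
\mathrm{Reg}^{d}_{T} \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(x_t^{*}),
\qquad
x_t^{*} \in \operatorname*{arg\,min}_{x \in \mathcal{X},\; g_t(x) \le 0} f_t(x).

% Fit: accumulated constraint violation, where [\cdot]_{+} clips negative entries to zero.
\mathrm{Fit}_{T} \;=\; \Bigl\| \Bigl[ \sum_{t=1}^{T} g_t(x_t) \Bigr]_{+} \Bigr\|.
```

Under this reading, "sublinear regret and fit" means both quantities grow as $o(T)$, so the time-averaged loss gap and the time-averaged constraint violation vanish as $T \to \infty$.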

Optimization and Control; Distributed, Parallel, and Cluster Computing; Systems and Control

Distributed Online Convex Optimization with Improved Dynamic Regret

Yan Zhang, Robert J. Ravier, Vahid Tarokh, 2019

In this paper, we consider the problem of distributed online convex optimization, where a group of agents collaborate to track the global minimizers of a sum of time-varying objective functions in an online manner. Specifically, we propose a novel distributed online gradient descent algorithm that relies on an online adaptation of the gradient tracking technique used in static optimization. We show that the dynamic regret bound of this algorithm has no explicit dependence on the time horizon and, therefore, can be tighter than existing bounds, especially for problems with long horizons. Our bound depends on a new regularity measure that quantifies the total change in the gradients at the optimal points at each time instant. Furthermore, when the optimizer is approximately subject to linear dynamics, we show that the dynamic regret bound can be further tightened by replacing the regularity measure that captures the path length of the optimizer with the accumulated prediction errors, which can be much lower in this special case. We present numerical experiments to corroborate our theoretical results.

Optimization and Control

Perturbation-based Regret Analysis of Predictive Control in Linear Time Varying Systems

Yiheng Lin, Yang Hu, Haoyuan Sun, 2021

We study predictive control in a setting where the dynamics are time-varying and linear, and the costs are time-varying and well-conditioned. At each time step, the controller receives the exact predictions of costs, dynamics, and disturbances for the future $k$ time steps. We show that when the prediction window $k$ is sufficiently large, predictive control is input-to-state stable and achieves a dynamic regret of $O(\lambda^k T)$, where $\lambda < 1$ is a positive constant. This is the first dynamic regret bound on the predictive control of linear time-varying systems. Under more assumptions on the terminal costs, we also show that predictive control obtains the first competitive bound for the control of linear time-varying systems: $1 + O(\lambda^k)$. Our results are derived using a novel proof framework based on a perturbation bound that characterizes how a small change to the system parameters impacts the optimal trajectory.
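A regret bound of the form $C \lambda^k T$ implies that a prediction window growing only logarithmically in $T$ is enough to keep the bound below a fixed target. The short sketch below works through that arithmetic. The constant `C` hidden in the $O(\cdot)$ notation, the target `eps`, and the function name are assumptions for illustration; the abstract does not give their values.

```python
import math

def min_horizon(lam, T, eps, C=1.0):
    """Smallest integer k >= 0 with C * lam**k * T <= eps, for 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    # C * lam**k * T <= eps  <=>  k >= log(C * T / eps) / log(1 / lam)
    k = math.ceil(math.log(C * T / eps) / math.log(1.0 / lam))
    return max(k, 0)

for T in (10**3, 10**6, 10**9):
    print(T, min_horizon(lam=0.5, T=T, eps=1.0))
```

Multiplying the time horizon by a factor of 1000 adds only about `log(1000) / log(1 / lam)` steps to the required window, which for `lam = 0.5` is roughly ten.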

Optimization and Control; Systems and Control

A Distributed Online Convex Optimization Algorithm with Improved Dynamic Regret

Yan Zhang, Robert J. Ravier, Michael M. Zavlanos, 2019

In this paper, we consider the problem of distributed online convex optimization, where a network of local agents aims to jointly optimize a convex function over a period of multiple time steps. The agents do not have any information about the future. Existing algorithms have established dynamic regret bounds that have explicit dependence on the number of time steps. In this work, we show that we can remove this dependence assuming that the local objective functions are strongly convex. More precisely, we propose a gradient tracking algorithm where agents jointly communicate and descend based on corrected gradient steps. We verify our theoretical results through numerical experiments.
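For readers unfamiliar with gradient tracking, the sketch below shows the standard static version of the technique, not the online algorithm proposed by the authors. It assumes scalar quadratic local objectives `f_i(x) = 0.5 * (x - b_i)**2`, a four-agent ring with a doubly stochastic mixing matrix `W`, and a hand-picked step size; all of these are hypothetical choices made for illustration.

```python
import numpy as np

def gradient_tracking(b, W, alpha=0.1, iters=300):
    """Static gradient tracking for f_i(x) = 0.5 * (x - b_i)**2 on a network.

    b: local targets, one per agent (hypothetical data).
    W: doubly stochastic mixing matrix matching the communication graph.
    Returns each agent's final iterate.
    """
    def grad(x):
        return x - b                 # stacked local gradients, one per agent

    x = np.zeros_like(b)
    y = grad(x)                      # tracker starts at the local gradients
    for _ in range(iters):
        x_new = W @ x - alpha * y            # mix with neighbors, then descend
        y = W @ y + grad(x_new) - grad(x)    # corrected step: track the average gradient
        x = x_new
    return x

# Ring of four agents: weight 0.5 on self, 0.25 on each of the two neighbors.
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])
b = np.array([1.0, 2.0, 4.0, 9.0])

x_final = gradient_tracking(b, W)
print("agent iterates:", x_final)
print("global minimizer:", b.mean())
```

The auxiliary variable `y` is the "corrected gradient": its network average always equals the average of the current local gradients, so every agent descends along an estimate of the global gradient rather than only its own.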

Optimization and Control; Machine Learning
