
Back to the Future: Efficient, Time-Consistent Solutions in Reach-Avoid Games

Published by David Fridovich-Keil
Publication date: 2021
Research language: English




We study the class of reach-avoid dynamic games in which multiple agents interact noncooperatively, and each wishes to satisfy a distinct target condition while avoiding a failure condition. Reach-avoid games are commonly used to express safety-critical optimal control problems found in mobile robot motion planning. While a wide variety of approaches exist for these motion planning problems, we focus on finding time-consistent solutions, in which planned future motion is still optimal despite prior suboptimal actions. Though abstract, time consistency encapsulates an extremely desirable property: namely, time-consistent motion plans remain optimal even when a robot's motion diverges from the plan early on due to, e.g., intrinsic dynamic uncertainty or extrinsic environment disturbances. Our main contribution is a computationally efficient algorithm for multi-agent reach-avoid games which renders time-consistent solutions. We demonstrate our approach in a simulated driving scenario, where we construct a two-player adversarial game to model a range of defensive driving behaviors.
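
For readers unfamiliar with the term, time consistency can be phrased as a dynamic-programming (subgame-perfection style) requirement. The notation below (player costs J_i, strategies gamma_i, state x_t, horizon T) is generic and chosen only for illustration; it is not taken from the paper:

\gamma_i^*\big|_{[t,T]} \;\in\; \arg\min_{\gamma_i} \, J_i\!\left(t, x_t;\; \gamma_i,\; \gamma_{-i}^*\big|_{[t,T]}\right)
\qquad \text{for every player } i,\ \text{every } t \in [0,T],\ \text{and every state } x_t .

In words: the strategies computed at time 0 remain equilibrium strategies for the subgame that starts at (t, x_t), even when x_t was reached by earlier suboptimal or disturbed play.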




Read also

The probabilistic reachability problems of nondeterministic systems are studied. Building on existing studies, the definition of probabilistic reachable sets is generalized to account for time-varying target and obstacle sets. A numerical method is proposed to compute probabilistic reachable sets. First, a scalar function on the state space is constructed by backward recursion and grid interpolation, and the probabilistic reachable set is then represented as a nonzero level set of this scalar function. In addition, an optimal control policy can be designed based on the constructed scalar function. Finally, several examples illustrate the validity and accuracy of the proposed method.
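
As a minimal sketch of the backward-recursion-and-interpolation idea above, consider a scalar system x_{k+1} = x_k + u_k + w_k with a discrete disturbance. The grids, dynamics, horizon, and the (static, for brevity) target and obstacle sets are illustrative assumptions, not the systems treated in the paper; the point is only the structure of building a scalar function whose level set is the probabilistic reachable set.

import numpy as np

# State grid, control candidates, and a discrete disturbance distribution.
xs = np.linspace(-5.0, 5.0, 201)
us = np.linspace(-1.0, 1.0, 11)
ws = np.array([-0.2, 0.0, 0.2])
pw = np.array([0.25, 0.5, 0.25])
N = 20                                             # horizon

in_target = lambda x: np.abs(x - 3.0) <= 0.5       # hypothetical target set
in_obstacle = lambda x: np.abs(x + 1.0) <= 0.5     # hypothetical obstacle

# Scalar function V_k(x): probability of reaching the target by time N while
# avoiding the obstacle, under the best control, starting from (k, x).
V = np.where(in_target(xs), 1.0, 0.0)              # terminal condition V_N
V = np.where(in_obstacle(xs), 0.0, V)

for k in range(N - 1, -1, -1):
    best = np.zeros_like(xs)
    for u in us:
        succ = np.zeros_like(xs)
        for w, p in zip(ws, pw):
            # Evaluate V_{k+1} at the successor states by grid interpolation.
            succ += p * np.interp(xs + u + w, xs, V)
        best = np.maximum(best, succ)
    V = np.where(in_target(xs), 1.0, best)         # reaching ends the game
    V = np.where(in_obstacle(xs), 0.0, V)          # hitting the obstacle fails

# The probabilistic reachable set at confidence level 0.9 is a level set of V.
reach_set = xs[V >= 0.9]

The control achieving the maximum at each grid point also gives an (approximately) optimal policy, which is the policy-design use mentioned above.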
In this paper, we present a method for finding approximate Nash equilibria in a broad class of reachability games. These games are often used to formulate both collision avoidance and goal satisfaction. Our method is computationally efficient, running in real-time for scenarios involving multiple players and more than ten state dimensions. The proposed approach forms a family of increasingly exact approximations to the original game. Our results characterize the quality of these approximations and show operation in a receding-horizon, minimally invasive control context. Additionally, as a special case, our method reduces to local gradient-based optimization in the single-player (optimal control) setting, for which a wide variety of efficient algorithms exist.
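
The "family of increasingly exact approximations" can be illustrated with a standard smoothing device: replace the hard minimum over time in a reachability cost with a temperature-controlled soft-minimum that becomes exact as the temperature shrinks, and is differentiable so local gradient-based optimization applies. The abstract does not specify the paper's approximation family, so the snippet below is only a generic numpy illustration with made-up costs.

import numpy as np

def softmin(v, tau):
    # Smooth under-approximation of min(v): within tau*log(len(v)) of the true
    # minimum, exact in the limit tau -> 0. Stabilized log-sum-exp form.
    v = np.asarray(v, dtype=float)
    m = v.min()
    return m - tau * np.log(np.sum(np.exp(-(v - m) / tau)))

# Hypothetical running distance-to-target along a trajectory; the reachability
# cost of interest is its minimum over time (did the trajectory ever arrive?).
dist_to_target = np.array([4.0, 2.5, 1.2, 0.4, 0.9, 2.0])

for tau in (1.0, 0.1, 0.01):
    print(tau, softmin(dist_to_target, tau), dist_to_target.min())

Shrinking tau tightens the approximation while keeping the cost differentiable, which is what makes gradient-based solvers applicable.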
In the current control design of safety-critical autonomous systems, formal verification techniques are typically applied after the controller is designed, to evaluate whether the required properties (e.g., safety) are satisfied. However, due to the increasing system complexity and the fundamental hardness of designing a controller with formal guarantees, such an open-loop process of design-then-verify often results in many iterations and fails to provide the necessary guarantees. In this paper, we propose a correct-by-construction control learning framework that integrates verification into the control design process in a closed-loop manner, i.e., design-while-verify. Specifically, we leverage the verification results (the computed reachable set of the system state) to construct feedback metrics for control learning, which measure how likely the current control parameters are to meet the required reach-avoid property for safety and goal-reaching. We formulate an optimization problem based on such metrics for tuning the controller parameters, and develop an approximated gradient descent algorithm with a difference method to solve the optimization problem and learn the controller. The learned controller is formally guaranteed to meet the required reach-avoid property. By treating verifiability as a first-class objective and effectively leveraging the verification results during the control learning process, our approach can significantly improve the chance of finding a control design with formal property guarantees. This is demonstrated via a set of experiments on both linear and nonlinear systems that use model-based or neural-network-based controllers.
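
A minimal sketch of the "approximated gradient descent with a difference method" step, assuming the verification-based feedback metric is available as a black-box function of the controller parameters. The metric below is a smooth made-up surrogate so the sketch runs; in the framework described above it would be computed from the reachable set of the closed-loop system, and a final verification pass would certify the learned parameters.

import numpy as np

def reach_avoid_metric(theta):
    # Placeholder for the verification-derived metric (e.g., how far the
    # computed reachable set violates the reach-avoid property); hypothetical.
    return float((theta[0] - 1.0) ** 2 + 0.5 * (theta[1] + 2.0) ** 2)

def finite_difference_grad(f, theta, eps=1e-4):
    # Central-difference approximation of the gradient, coordinate by coordinate.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2.0 * eps)
    return g

theta = np.array([0.0, 0.0])          # controller parameters being tuned
step = 0.1
for _ in range(200):
    theta = theta - step * finite_difference_grad(reach_avoid_metric, theta)
print(theta)                          # approaches the metric's minimizer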
This paper studies a planar multiplayer Homicidal Chauffeur reach-avoid differential game, where each pursuer is a Dubins car and each evader has simple motion. The pursuers aim to protect a goal region cooperatively from the evaders. Due to the high-dimensional strategy space among pursuers, we decompose the whole game into multiple one-pursuer-one-evader subgames, each of which is solved analytically instead of by solving Hamilton-Jacobi-Isaacs equations. For each subgame, an evasion region (ER) is introduced, based on which a pursuit strategy guaranteeing the winning of a simple-motion pursuer under specific conditions is proposed. Motivated by the simple-motion pursuer, a strategy for a Dubins-car pursuer is proposed when the pursuer-evader configuration satisfies the separation condition (SC) and interception orientation (IO). The necessary and sufficient condition on capture radius, minimum turning radius, and speed ratio to guarantee pursuit winning is derived. When the IO is not satisfied (Non-IO), a heading-adjustment pursuit strategy is proposed, and the condition to achieve IO within finite time is given. Then, a two-step pursuit strategy is proposed for the SC and Non-IO case. A non-convex optimization problem is introduced to give a condition guaranteeing the winning of the pursuer, and a polynomial equation provides a lower bound for the non-convex problem, yielding a sufficient and efficiently checkable pursuit-winning condition. Finally, these pairwise outcomes are collected for the pursuer-evader matching. Simulations are provided to illustrate the theoretical results.
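
The final step, collecting the pairwise subgame outcomes and matching pursuers to evaders, can be made concrete with a standard assignment solver. Whether the paper formulates the matching exactly this way is not stated in the abstract, and the win/lose entries below are placeholders for the analytic SC/IO-based tests described above.

import numpy as np
from scipy.optimize import linear_sum_assignment

# wins[i, j] = 1 if pursuer i is guaranteed to win its one-on-one subgame
# against evader j, else 0 (placeholder values).
wins = np.array([
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
])

# Assign pursuers to evaders to maximize the number of guaranteed wins;
# linear_sum_assignment minimizes cost, so negate the win matrix.
rows, cols = linear_sum_assignment(-wins)
print(list(zip(rows.tolist(), cols.tolist())), int(wins[rows, cols].sum()))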
While the topic of mean-field games (MFGs) has a relatively long history, there has heretofore been limited work on algorithms for computing equilibrium control policies. In this paper, we develop a computable policy iteration algorithm for approximating the mean-field equilibrium in linear-quadratic MFGs with discounted cost. Given the mean field, each agent faces a linear-quadratic tracking problem, the solution of which involves a dynamical system evolving in retrograde time. This makes the development of forward-in-time algorithm updates challenging. By identifying a structural property of the mean-field update operator, namely that it preserves sequences of a particular form, we develop a forward-in-time equilibrium computation algorithm. Bounds that quantify the accuracy of the computed mean-field equilibrium as a function of the algorithm's stopping condition are provided. The optimality of the computed equilibrium is validated numerically. In contrast to the most recent and concurrent results, our algorithm appears to be the first to study infinite-horizon MFGs with non-stationary mean-field equilibria, though with a focus on the linear-quadratic setting.
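
To make the mean-field fixed-point structure concrete, here is a scalar, finite-horizon sketch: given a candidate mean trajectory z, the representative agent solves a discounted linear-quadratic tracking problem, and at equilibrium z must coincide with the mean trajectory its best response produces. The batch least-squares solve below is only meant to expose that structure; it is not the paper's forward-in-time policy iteration, and all numbers (and the finite horizon standing in for the infinite-horizon setting) are illustrative.

import numpy as np

a, b, q, r, beta = 0.9, 0.5, 1.0, 0.1, 0.95   # dynamics, costs, discount
T, x0 = 40, 2.0                                # horizon and common initial mean

# Stack the trajectory as x = M @ u + m0 for dynamics x_{k+1} = a*x_k + b*u_k.
m0 = x0 * a ** np.arange(T + 1)
M = np.zeros((T + 1, T))
for k in range(1, T + 1):
    for j in range(k):
        M[k, j] = a ** (k - 1 - j) * b

Wq = np.diag(q * beta ** np.arange(T + 1))     # discounted tracking weights
Wr = np.diag(r * beta ** np.arange(T))         # discounted control weights

def best_response(z):
    # Minimize sum_k beta^k [ q*(x_k - z_k)^2 + r*u_k^2 ] over the controls.
    H = M.T @ Wq @ M + Wr
    g = M.T @ Wq @ (z - m0)
    u = np.linalg.solve(H, g)
    return M @ u + m0                           # induced state trajectory

z = np.full(T + 1, x0)                          # initial mean-field guess
for _ in range(200):
    z = 0.5 * z + 0.5 * best_response(z)        # damped fixed-point iteration
# At convergence, z approximates the mean-field equilibrium trajectory.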