A Dynamics Perspective of Pursuit-Evasion Games of Intelligent Agents with the Ability to Learn

Published by Hao Xiong

Publication date: 2021

Research field: Electronic Engineering, Information Engineering

Language: English

Authors: Hao Xiong - Huanhui Cao - Lin Zhang

Abstract (English)

Pursuit-evasion games are ubiquitous in nature and in the artificial world. In nature, pursuers and evaders are intelligent agents that can learn from experience, and dynamics (i.e., Newtonian or Lagrangian) is vital for the pursuer and the evader in some scenarios. To this end, this paper addresses the pursuit-evasion game of intelligent agents from the perspective of dynamics. First, a bio-inspired dynamics formulation of a pursuit-evasion game and baseline pursuit and evasion strategies are introduced. Then, reinforcement learning techniques are used to mimic the ability of intelligent agents to learn from experience. Based on the dynamics formulation and reinforcement learning techniques, the effects of improving both pursuit and evasion strategies through experience are investigated at two levels: 1) individual runs and 2) ranges of the parameters of pursuit-evasion games. The results are consistent with observations in nature and with the natural law of survival of the fittest. More importantly, this study reaches a different result from that of a pursuit-evasion game between agents with baseline strategies. It is shown that, in a pursuit-evasion game with a dynamics formulation, an evader cannot escape from a slightly faster pursuer that uses an effective learned pursuit strategy, even when the evader relies on agile maneuvers and an effective learned evasion strategy.
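To make the idea of a dynamics formulation with baseline strategies more concrete, the following is a minimal sketch. It is not the paper's model. It assumes planar point-mass agents obeying Newton's second law with a bounded thrust and a linear drag term, a pursuer that always thrusts toward the evader (pure pursuit), and an evader that always thrusts directly away. The masses, drag coefficient, force limits, capture radius, and the function names `step` and `simulate` are all illustrative assumptions, and no reinforcement learning is involved.

```python
import math


def unit(dx, dy):
    """Return the unit vector along (dx, dy), or (0, 0) for a zero vector."""
    n = math.hypot(dx, dy)
    return (dx / n, dy / n) if n > 0 else (0.0, 0.0)


def step(pos, vel, direction, f_max, mass, drag, dt):
    """Advance one point mass by dt using m*a = f_max*direction - drag*v.

    Semi-implicit Euler: velocity is updated first, then position uses
    the new velocity.
    """
    ax = (f_max * direction[0] - drag * vel[0]) / mass
    ay = (f_max * direction[1] - drag * vel[1]) / mass
    vel = (vel[0] + ax * dt, vel[1] + ay * dt)
    pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
    return pos, vel


def simulate(pursuer_force=1.2, evader_force=1.0, capture_radius=0.1,
             dt=0.01, max_steps=20000):
    """Run one game; return (captured, elapsed_time).

    All numeric defaults are hypothetical values chosen for illustration.
    """
    mass, drag = 1.0, 1.0  # assumed identical for both agents
    p_pos, p_vel = (0.0, 0.0), (0.0, 0.0)
    e_pos, e_vel = (5.0, 0.0), (0.0, 0.0)
    for k in range(max_steps):
        dx, dy = e_pos[0] - p_pos[0], e_pos[1] - p_pos[1]
        if math.hypot(dx, dy) <= capture_radius:
            return True, k * dt
        line_of_sight = unit(dx, dy)
        # Baseline pursuit: thrust along the line of sight.
        p_pos, p_vel = step(p_pos, p_vel, line_of_sight,
                            pursuer_force, mass, drag, dt)
        # Baseline evasion: thrust directly away from the pursuer,
        # which is the same direction as the line of sight.
        e_pos, e_vel = step(e_pos, e_vel, line_of_sight,
                            evader_force, mass, drag, dt)
    return False, max_steps * dt


if __name__ == "__main__":
    captured, t = simulate()
    print("captured:", captured, "time:", round(t, 2))
```

Under these assumptions each agent's terminal speed is its force limit divided by the drag coefficient, so a larger `pursuer_force` plays the role of a slightly faster pursuer. With straight-line flight the faster pursuer eventually closes the gap. The question the abstract raises is whether learned, agile evasion changes that outcome, which this sketch does not attempt to answer.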

99 - Mohammadreza Doostmohammadian , Alireza Aghasi , Themistoklis Charalambous 2021

In this paper, a general nonlinear 1st-order consensus-based solution for distributed constrained convex optimization is considered for applications in network resource allocation. The proposed continuous-time solution is used to optimize continuously-differentiable strictly convex cost functions over weakly-connected undirected multi-agent networks. The solution is anytime feasible and models various nonlinearities to account for imperfections and constraints on the (physical model of) agents in terms of their limited actuation capabilities, e.g., quantization and saturation constraints among others. Moreover, different applications impose specific nonlinearities to the model, e.g., convergence in fixed/finite-time, robustness to uncertainties, and noise-tolerant dynamics. Our proposed distributed resource allocation protocol generalizes such nonlinear models. Putting convex set analysis together with the Lyapunov theorem, we provide a general technique to prove convergence (i) regardless of the particular type of nonlinearity (ii) with weak network-connectivity requirement (i.e., uniform-connectivity). We simulate the performance of the protocol in continuous-time coordination of generators, known as the economic dispatch problem (EDP).

Systems and Control, Multiagent Systems

A decomposition technique for pursuit evasion games with many pursuers

361 - Adriano Festa , Richard B. Vinter 2013

Here we present a decomposition technique for a class of differential games. The technique consists in a decomposition of the target set which produces, for geometrical reasons, a decomposition in the dimensionality of the problem. Using some elements of Hamilton-Jacobi equations theory, we find a relation between the regularity of the solution and the possibility to decompose the problem. We use this technique to solve a pursuit evasion game with multiple agents.

Optimization and Control, Analysis of PDEs

Using Intelligent Agents to understand organisational behaviour

476 - Helen Celia , Christopher Clegg , Mark Robinson 2008

This paper introduces two ongoing research projects which seek to apply computer modelling techniques in order to simulate human behaviour within organisations. Previous research in other disciplines has suggested that complex social behaviours are governed by relatively simple rules which, when identified, can be used to accurately model such processes using computer technology. The broad objective of our research is to develop a similar capability within organisational psychology.

Neural and Evolutionary Computing, Multiagent Systems

Iterative Best Response for Multi-Body Asset-Guarding Games

136 - Emmanuel Sin , Murat Arcak , Douglas Philbrick 2020

We present a numerical approach to finding optimal trajectories for players in a multi-body, asset-guarding game with nonlinear dynamics and non-convex constraints. Using the Iterative Best Response (IBR) scheme, we solve for each player's optimal strategy assuming the other players' trajectories are known and fixed. Leveraging recent advances in Sequential Convex Programming (SCP), we use SCP as a subroutine within the IBR algorithm to efficiently solve an approximation of each player's constrained trajectory optimization problem. We apply the approach to an asset-guarding game example involving multiple pursuers and a single evader (i.e., n-versus-1 engagements). Resulting evader trajectories are tested in simulation to verify successful evasion against pursuers using conventional intercept guidance laws.

Systems and Control, Multiagent Systems

Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games

134 - Muhammad Aneeq uz Zaman , Kaiqing Zhang , Erik Miehling 2020

While the topic of mean-field games (MFGs) has a relatively long history, heretofore there has been limited work concerning algorithms for the computation of equilibrium control policies. In this paper, we develop a computable policy iteration algorithm for approximating the mean-field equilibrium in linear-quadratic MFGs with discounted cost. Given the mean-field, each agent faces a linear-quadratic tracking problem, the solution of which involves a dynamical system evolving in retrograde time. This makes the development of forward-in-time algorithm updates challenging. By identifying a structural property of the mean-field update operator, namely that it preserves sequences of a particular form, we develop a forward-in-time equilibrium computation algorithm. Bounds that quantify the accuracy of the computed mean-field equilibrium as a function of the algorithm's stopping condition are provided. The optimality of the computed equilibrium is validated numerically. In contrast to the most recent/concurrent results, our algorithm appears to be the first to study infinite-horizon MFGs with non-stationary mean-field equilibria, though with focus on the linear quadratic setting.

Systems and Control, Multiagent Systems
