ترغب بنشر مسار تعليمي؟ اضغط هنا

Model-Based Stochastic Search for Large Scale Optimization of Multi-Agent UAV Swarms

58   0   0.0 ( 0 )
 نشر من قبل David D. Fan
 تاريخ النشر 2018
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Recent work from the reinforcement learning community has shown that Evolution Strategies are a fast and scalable alternative to other reinforcement learning methods. In this paper we show that Evolution Strategies are a special case of model-based stochastic search methods. This class of algorithms has nice asymptotic convergence properties and known convergence rates. We show how these methods can be used to solve both cooperative and competitive multi-agent problems in an efficient manner. We demonstrate the effectiveness of this approach on two complex multi-agent UAV swarm combat scenarios: where a team of fixed wing aircraft must attack a well-defended base, and where two teams of agents go head to head to defeat each other.

قيم البحث

اقرأ أيضاً

Smart traffic control and management become an emerging application for Deep Reinforcement Learning (DRL) to solve traffic congestion problems in urban networks. Different traffic control and management policies can be tested on the traffic simulatio n. Current DRL-based studies are mainly supported by the microscopic simulation software (e.g., SUMO), while it is not suitable for city-wide control due to the computational burden and gridlock effect. To the best of our knowledge, there is a lack of studies on the large-scale traffic simulator for DRL testbeds, which could further hinder the development of DRL. In view of this, we propose a meso-macro traffic simulator for very large-scale DRL scenarios. The proposed simulator integrates mesoscopic and macroscopic traffic simulation models to improve efficiency and eliminate gridlocks. The mesoscopic link model simulates flow dynamics on roads, and the macroscopic Bathtub model depicts vehicle movement in regions. Moreover, both types of models can be hybridized to accommodate various DRL tasks. This creates portals for mixed transportation applications under different contexts. The result shows that the developed simulator only takes 46 seconds to finish a 24-hour simulation in a very large city with 2.2 million vehicles, which is much faster than SUMO. Additionally, we develop a graphic interface for users to visualize the simulation results in a web explorer. In the future, the developed meso-macro traffic simulator could serve as a new environment for very large-scale DRL problems.
UAV swarms have triggered wide concern due to their potential application values in recent years. While there are studies proposed in terms of the architecture design for UAV swarms, two main challenges still exist: (1) Scalability, supporting a larg e scale of vehicles; (2) Versatility, integrating diversified missions. To this end, a multi-layered and distributed architecture for mission oriented miniature fixed-wing UAV swarms is presented in this paper. The proposed architecture is built on the concept of modularity. It divides the overall system to five layers: low-level control, high-level control, coordination, communication and human interaction layers, and many modules that can be viewed as black boxes with interfaces of inputs and outputs. In this way, not only the complexity of developing a large system can be reduced, but also the versatility of supporting diversified missions can be ensured. Furthermore, the proposed architecture is fully distributed that each UAV performs the decision-making procedure autonomously so as to achieve better scalability. Moreover, different kinds of aerial platforms can be feasibly extended by using the control allocation matrices and the integrated hardware box. A prototype swarm system based on the proposed architecture is built and the proposed architecture is evaluated through field experiments with a scale of 21 fixed-wing UAVs. Particularly, to the best of our knowledge, this paper is the first work which successfully demonstrates formation flight, target recognition and tracking missions within an integrated architecture for fixed-wing UAV swarms through field experiments.
Electricity market modelling is often used by governments, industry and agencies to explore the development of scenarios over differing timeframes. For example, how would the reduction in cost of renewable energy impact investments in gas power plant s or what would be an optimum strategy for carbon tax or subsidies? Cost optimization based solutions are the dominant approach for understanding different long-term energy scenarios. However, these types of models have certain limitations such as the need to be interpreted in a normative manner, and the assumption that the electricity market remains in equilibrium throughout. Through this work, we show that agent-based models are a viable technique to simulate decentralised electricity markets. The aim of this paper is to validate an agent-based modelling framework to increase confidence in its ability to be used in policy and decision making. Our framework can model heterogeneous agents with imperfect information. The model uses a rules-based approach to approximate the underlying dynamics of a real world, decentralised electricity market. We use the UK as a case study, however, our framework is generalisable to other countries. We increase the temporal granularity of the model by selecting representative days of electricity demand and weather using a $k$-means clustering approach. We show that our framework can model the transition from coal to gas observed in the UK between 2013 and 2018. We are also able to simulate a future scenario to 2035 which is similar to the UK Government, Department for Business and Industrial Strategy (BEIS) projections. We show a more realistic increase in nuclear power over this time period. This is due to the fact that with current nuclear technology, electricity is generated almost instantaneously and has a low short-run marginal cost cite{Department2016}.
Simulation of population dynamics is a central research theme in computational biology, which contributes to understanding the interactions between predators and preys. Conventional mathematical tools of this theme, however, are incapable of accounti ng for several important attributes of such systems, such as the intelligent and adaptive behavior exhibited by individual agents. This unrealistic setting is often insufficient to simulate properties of population dynamics found in the real-world. In this work, we leverage multi-agent deep reinforcement learning, and we propose a new model of large-scale predator-prey ecosystems. Using different variants of our proposed environment, we show that multi-agent simulations can exhibit key real-world dynamical properties. To obtain this behavior, we firstly define a mating mechanism such that existing agents reproduce new individuals bound by the conditions of the environment. Furthermore, we incorporate a real-time evolutionary algorithm and show that reinforcement learning enhances the evolution of the agents physical properties such as speed, attack and resilience against attacks.
This work develops effective distributed strategies for the solution of constrained multi-agent stochastic optimization problems with coupled parameters across the agents. In this formulation, each agent is influenced by only a subset of the entries of a global parameter vector or model, and is subject to convex constraints that are only known locally. Problems of this type arise in several applications, most notably in disease propagation models, minimum-cost flow problems, distributed control formulations, and distributed power system monitoring. This work focuses on stochastic settings, where a stochastic risk function is associated with each agent and the objective is to seek the minimizer of the aggregate sum of all risks subject to a set of constraints. Agents are not aware of the statistical distribution of the data and, therefore, can only rely on stochastic approximations in their learning strategies. We derive an effective distributed learning strategy that is able to track drifts in the underlying parameter model. A detailed performance and stability analysis is carried out showing that the resulting coupled diffusion strategy converges at a linear rate to an $O(mu)-$neighborhood of the true penalized optimizer.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا