ترغب بنشر مسار تعليمي؟ اضغط هنا

Context-Aware Deep Q-Network for Decentralized Cooperative Reconnaissance by a Robotic Swarm

109   0   0.0 ( 0 )
 نشر من قبل Mohitvishnu S. Gadde
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

One of the crucial problems in robotic swarm-based operation is to search and neutralize heterogeneous targets in an unknown and uncertain environment, without any communication within the swarm. Here, some targets can be neutralized by a single robot, while others need multiple robots in a particular sequence to neutralize them. The complexity in the problem arises due to the scalability and information uncertainty, which restricts the robots awareness of the swarm and the target distribution. In this paper, this problem is addressed by proposing a novel Context-Aware Deep Q-Network (CA-DQN) framework to obtain communication free cooperation between the robots in the swarm. Each robot maintains an adaptive grid representation of the vicinity with the context information embedded into it to keep the swarm intact while searching and neutralizing the targets. The problem formulation uses a reinforcement learning framework where two Deep Q-Networks (DQNs) handle conflict and conflict-free scenarios separately. The self-play-in-based approach is used to determine the optimal policy for the DQNs. Monte-Carlo simulations and comparison studies with a state-of-the-art coalition formation algorithm are performed to verify the performance of CA-DQN with varying environmental parameters. The results show that the approach is invariant to the number of detected targets and the number of robots in the swarm. The paper also presents the real-time implementation of CA-DQN for different scenarios using ground robots in a laboratory environment to demonstrate the working of CA-DQN with low-power computing devices.

قيم البحث

اقرأ أيضاً

Multiple robotic systems, working together, can provide important solutions to different real-world applications (e.g., disaster response), among which task allocation problems feature prominently. Very few existing decentralized multi-robotic task a llocation (MRTA) methods simultaneously offer the following capabilities: consideration of task deadlines, consideration of robot range and task completion capacity limitations, and allowing asynchronous decision-making under dynamic task spaces. To provision these capabilities, this paper presents a computationally efficient algorithm that involves novel construction and matching of bipartite graphs. Its performance is tested on a multi-UAV flood response application.
Swarm robotic search is concerned with searching targets in unknown environments (e.g., for search and rescue or hazard localization), using a large number of collaborating simple mobile robots. In such applications, decentralized swarm systems are t outed for their task/coverage scalability, time efficiency, and fault tolerance. To guide the behavior of such swarm systems, two broad classes of approaches are available, namely nature-inspired swarm heuristics and multi-robotic search methods. However, simultaneously offering computationally-efficient scalability and fundamental insights into the exhibited behavior (instead of a black-box behavior model), remains challenging under either of these two class of approaches. In this paper, we develop an important extension of the batch Bayesian search method for application to embodied swarm systems, searching in a physical 2D space. Key contributions lie in: 1) designing an acquisition function that not only balances exploration and exploitation across the swarm, but also allows modeling knowledge extraction over trajectories; and 2) developing its distributed implementation to allow asynchronous task inference and path planning by the swarm robots. The resulting collective informative path planning approach is tested on target search case studies of varying complexity, where the target produces a spatially varying (measurable) signal. Significantly superior performance, in terms of mission completion efficiency, is observed compared to exhaustive search and random walk baselines, along with favorable performance scalability with increasing swarm size.
Decentralized swarm robotic solutions to searching for targets that emit a spatially varying signal promise task parallelism, time efficiency, and fault tolerance. It is, however, challenging for swarm algorithms to offer scalability and efficiency, while preserving mathematical insights into the exhibited behavior. A new decentralized search method (called Bayes-Swarm), founded on batch Bayesian Optimization (BO) principles, is presented here to address these challenges. Unlike swarm heuristics approaches, Bayes-Swarm decouples the knowledge generation and task planning process, thus preserving insights into the emergent behavior. Key contributions lie in: 1) modeling knowledge extraction over trajectories, unlike in BO; 2) time-adaptively balancing exploration/exploitation and using an efficient local penalization approach to account for potential interactions among different robots planned samples; and 3) presenting an asynchronous implementation of the algorithm. This algorithm is tested on case studies with bimodal and highly multimodal signal distributions. Up to 76 times better efficiency is demonstrated compared to an exhaustive search baseline. The benefits of exploitation/exploration balancing, asynchronous planning, and local penalization, and scalability with swarm size, are also demonstrated.
In this paper, we propose a novel two-stage context-aware network named CANet for shadow removal, in which the contextual information from non-shadow regions is transferred to shadow regions at the embedded feature spaces. At Stage-I, we propose a co ntextual patch matching (CPM) module to generate a set of potential matching pairs of shadow and non-shadow patches. Combined with the potential contextual relationships between shadow and non-shadow regions, our well-designed contextual feature transfer (CFT) mechanism can transfer contextual information from non-shadow to shadow regions at different scales. With the reconstructed feature maps, we remove shadows at L and A/B channels separately. At Stage-II, we use an encoder-decoder to refine current results and generate the final shadow removal results. We evaluate our proposed CANet on two benchmark datasets and some real-world shadow images with complex scenes. Extensive experimental results strongly demonstrate the efficacy of our proposed CANet and exhibit superior performance to state-of-the-arts.
Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omni-directional target with multiple, homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers that is executed independently by each agent at run-time. The training benefits from curriculum learning, a sweeping-angle ordering to locally represent neighboring agents and encouraging good formations with reward structure that combines individual and group rewards. Simulated experiments with a reactive evader and up to eight pursuers show that our learning-based approach, with non-holonomic agents, performs on par with classical algorithms with omni-directional agents, and outperforms their non-holonomic adaptations. The learned policy is successfully transferred to the real world in a proof-of-concept demonstration with three motion-constrained pursuer drones.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا