Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. This paper proposes a method called PA-POMCPOW to sample a subset of the action space that provides varying mixtures of exploitation and exploration for inclusion in a search tree. The proposed method first evaluates the action space according to a score function that is a linear combination of expected reward and expected information gain. The actions with the highest score are then added to the search tree during tree expansion. Experiments show that PA-POMCPOW is able to outperform existing state-of-the-art solvers on problems with large discrete action spaces.
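To make the action-selection idea concrete, the sketch below illustrates scoring candidate actions by a linear combination of expected reward and expected information gain and keeping the top-scoring subset for tree expansion. This is a minimal illustration, not the paper's implementation: the names `reward_fn`, `info_gain_fn`, the trade-off weight `lam`, and the subset size `k` are assumed placeholders for problem-specific estimators and tuning parameters.

```python
def score_actions(actions, belief, reward_fn, info_gain_fn, lam=1.0):
    """Score each action as expected reward plus lam times expected
    information gain under the current belief (assumed estimators)."""
    return {a: reward_fn(belief, a) + lam * info_gain_fn(belief, a)
            for a in actions}


def sample_action_subset(actions, belief, reward_fn, info_gain_fn,
                         k=10, lam=1.0):
    """Return the k highest-scoring actions, i.e. the subset of the
    action space that would be added to the search tree on expansion."""
    scores = score_actions(actions, belief, reward_fn, info_gain_fn, lam)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Increasing `lam` biases the sampled subset toward information-gathering (exploratory) actions, while `lam = 0` reduces it to greedy selection on expected reward.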