The theory of belief functions manages uncertainty and provides a set of combination rules for aggregating the opinions of several sources. Some combination rules assume that the sources are independent; others are suited to combining evidential information held by dependent sources. This paper makes two main contributions. First, we suggest a method to quantify the sources' degree of independence, which may guide the choice of the most appropriate type of combination rule. Second, we propose a new combination rule that takes the sources' degree of independence into account. The proposed method is illustrated on generated mass functions.
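As a point of reference for the rules that assume independent sources, the following is a minimal sketch of the classical conjunctive combination followed by Dempster's normalization. It is not the new rule proposed in the abstract above. The function names, the two-element frame, and the mass values are hypothetical and chosen only for illustration.

```python
from itertools import product


def conjunctive_combination(m1, m2):
    """Unnormalized conjunctive rule: multiply masses, intersect focal sets."""
    combined = {}
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        focal = a & b  # an empty intersection collects the conflicting mass
        combined[focal] = combined.get(focal, 0.0) + mass_a * mass_b
    return combined


def dempster_rule(m1, m2):
    """Dempster's rule: conjunctive combination, then renormalize by 1 - conflict."""
    combined = conjunctive_combination(m1, m2)
    conflict = combined.pop(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    fused = {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}
    return fused, conflict


# Hypothetical mass functions on the frame {a, b} (illustration only).
A, B = frozenset({"a"}), frozenset({"b"})
OMEGA = A | B
m1 = {A: 0.6, OMEGA: 0.4}
m2 = {B: 0.3, OMEGA: 0.7}

fused, conflict = dempster_rule(m1, m2)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 4))
print("conflict:", round(conflict, 4))
```

The conflict mass returned alongside the fused result is the quantity that conflict-handling rules redistribute in different ways; a rule designed for dependent sources would replace the product of masses used here.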
Defining and modeling the inclusion relation between continuous belief functions can be considered an important operation for studying their behavior. In this paper we propose two forms of inclusion: strict and partial. To develop this relation, we study the case of consonant belief functions. We simulate normal distributions to model and analyze these relations and, on that basis, determine the parameters that influence and characterize the two forms of inclusion.
Search is an important tool for computing effective policies in single- and multi-agent environments, and has been crucial for achieving superhuman performance in several benchmark fully and partially observable games. However, one major limitation of prior search approaches for partially observable environments is that the computational cost scales poorly with the amount of hidden information. In this paper we present Learned Belief Search (LBS), a computationally efficient search procedure for partially observable environments. Rather than maintaining an exact belief distribution, LBS uses an approximate auto-regressive counterfactual belief that is learned as a supervised task. In multi-agent settings, LBS uses a novel public-private model architecture for underlying policies in order to efficiently evaluate these policies during rollouts. In the benchmark domain of Hanabi, LBS can obtain 55% to 91% of the benefit of exact search while reducing compute requirements by $35.8\times$ to $4.6\times$, allowing it to scale to larger settings that were inaccessible to previous search methods.
The standard problem setting in Dec-POMDPs is self-play, where the goal is to find a set of policies that play optimally together. Policies learned through self-play may adopt arbitrary conventions and implicitly rely on multi-step reasoning based on fragile assumptions about other agents' actions, and thus fail when paired with humans or independently trained agents at test time. To address this, we present off-belief learning (OBL). At each timestep, OBL agents follow a policy $\pi_1$ that is optimized assuming past actions were taken by a given, fixed policy ($\pi_0$), but assuming that future actions will be taken by $\pi_1$. When $\pi_0$ is uniform random, OBL converges to an optimal policy that does not rely on inferences based on other agents' behavior (an optimal grounded policy). OBL can be iterated in a hierarchy, where the optimal policy from one level becomes the input to the next, thereby introducing multi-level cognitive reasoning in a controlled manner. Unlike existing approaches, which may converge to any equilibrium policy, OBL converges to a unique policy, making it suitable for zero-shot coordination (ZSC). OBL can be scaled to high-dimensional settings with a fictitious transition mechanism and shows strong performance both in a toy setting and in Hanabi, the benchmark problem for human-AI coordination and ZSC.
In this work, we introduce a new approach for the efficient solution of autonomous decision and planning problems, with a special focus on decision making under uncertainty and belief space planning (BSP) in high-dimensional state spaces. Usually, to solve the decision problem, we identify the optimal action, according to some objective function. We claim that we can sometimes generate and solve an analogous yet simplified decision problem, which can be solved more efficiently; a wise simplification method can lead to the same action selection, or one for which the maximal loss can be guaranteed. Furthermore, such simplification is separated from the state inference, and does not compromise its accuracy, as the selected action would finally be applied on the original state. First, we present the concept for general decision problems, and provide a theoretical framework for a coherent formulation of the approach. We then practically apply these ideas to BSP problems, which can be simplified by considering a sparse approximation of the initial (Gaussian) belief. The scalable belief sparsification algorithm we provide is able to yield solutions which are guaranteed to be consistent with the original problem. We demonstrate the benefits of the approach in the solution of a highly realistic active-SLAM problem, and manage to significantly reduce computation time, with practically no loss in the quality of solution. This work is conceptual and fundamental, and holds numerous possible extensions.
We present and discuss a mixed conjunctive and disjunctive rule, a generalization of conflict repartition rules, and a combination of these two rules. In belief function theory, one of the major problems is the repartition of conflict, highlighted by Zadeh's famous example. To date, many combination rules have been proposed to solve this problem. Moreover, it can be important to take into account the specificity of the experts' responses. In recent years, some unification rules have been proposed. In our previous work we showed the interest of the proportional conflict redistribution rule. Here we propose a mixed combination rule that follows the proportional conflict redistribution rule, modified by a discounting procedure. This rule generalizes many combination rules.
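To make the conflict problem concrete, the sketch below runs Zadeh's example through Dempster's rule and through a proportional conflict redistribution. The redistribution shown is the commonly cited two-source PCR5 form, which is an assumption here; it is not the mixed rule with discounting proposed in the abstract above. Function names are hypothetical, and the input mass functions are assumed to place no mass on the empty set.

```python
from itertools import product


def dempster_rule(m1, m2):
    """Conjunctive combination renormalized by 1 - conflict."""
    combined = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        combined[x & y] = combined.get(x & y, 0.0) + mx * my
    conflict = combined.pop(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


def pcr5_two_sources(m1, m2):
    """Two-source PCR5 (assumed form): each partial conflict m1(X) * m2(Y)
    is split between X and Y in proportion to m1(X) and m2(Y)."""
    fused = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        inter = x & y
        if inter:
            # Compatible focal sets: usual conjunctive contribution.
            fused[inter] = fused.get(inter, 0.0) + mx * my
        elif mx + my > 0:
            # Conflicting focal sets: the two shares below sum to mx * my.
            fused[x] = fused.get(x, 0.0) + mx ** 2 * my / (mx + my)
            fused[y] = fused.get(y, 0.0) + my ** 2 * mx / (mx + my)
    return fused


# Zadeh's example: two experts who almost fully disagree.
A, B, C = frozenset({"A"}), frozenset({"B"}), frozenset({"C"})
m1 = {A: 0.99, C: 0.01}
m2 = {B: 0.99, C: 0.01}

dempster = dempster_rule(m1, m2)
pcr = pcr5_two_sources(m1, m2)
print("Dempster:", {min(f): round(v, 6) for f, v in dempster.items()})
print("PCR5:    ", {min(f): round(v, 6) for f, v in pcr.items()})
```

Under Dempster's rule essentially all of the mass ends up on C, the hypothesis both experts consider very unlikely. The redistribution instead returns most of the conflicting mass to A and B, in roughly equal shares, and leaves only a small mass on C.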