In imperfect-information games, subgame solving is significantly more challenging than in perfect-information games, but in the last few years such techniques have been developed. They were the key ingredient in the milestone of superhuman play in no-limit Texas hold'em poker. Current subgame-solving techniques analyze the entire common-knowledge closure of the player's current information set, that is, the smallest set of nodes within which it is common knowledge that the current node lies. However, this set is too large to handle in many games. We introduce an approach that overcomes this obstacle by instead working with only low-order knowledge. Our approach allows an agent, upon arriving at an infoset, to prune any node that is no longer reachable, thereby massively reducing the game tree size relative to the common-knowledge subgame. We prove that, as is, our approach can increase exploitability compared to the blueprint strategy. However, we develop three avenues by which safety can be guaranteed. First, safety is guaranteed if the results of subgame solves are incorporated back into the blueprint. Second, we provide a method where safety is achieved by limiting the infosets at which subgame solving is performed. Third, we prove that our approach, when applied at every infoset reached during play, achieves a weaker notion of equilibrium, which we coin affine equilibrium and which may be of independent interest. We show that affine equilibria cannot be exploited by any Nash strategy of the opponent, so an opponent who wishes to exploit must open herself to counter-exploitation. Even without the safety-guaranteeing additions, experiments on medium-sized games show that our approach always reduced exploitability, even when applied at every infoset, and a depth-limited version of it led to (to our knowledge) the first strong AI for the massive challenge problem of dark chess.
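To make the bottleneck concrete, the common-knowledge closure can be computed as a fixed point: starting from the nodes of the current infoset, repeatedly add every node that some player cannot distinguish from a node already in the set. The following is a minimal Python sketch under an assumed data representation (a map `infoset_of[player][node]` returning the nodes in that player's infoset containing `node`); it is illustrative, not the paper's implementation. The low-order-knowledge approach described above avoids materializing this closure, retaining only the typically far smaller set of nodes that remain reachable.

```python
from collections import deque

def common_knowledge_closure(start_nodes, infoset_of):
    """Smallest superset of `start_nodes` closed under every
    player's information partition: the connected component of
    the current infoset under the relation "indistinguishable
    to some player".

    `infoset_of[player][node]` -> iterable of nodes in the same
    infoset as `node` for that player (assumed representation).
    """
    closure = set(start_nodes)
    frontier = deque(start_nodes)
    while frontier:
        node = frontier.popleft()
        for player_map in infoset_of.values():
            # Every node some player confuses with `node` must
            # also be in the closure.
            for other in player_map[node]:
                if other not in closure:
                    closure.add(other)
                    frontier.append(other)
    return closure
```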
Cooperative multi-agent reinforcement learning often requires decentralised policies, which severely limit the agents' ability to coordinate their behaviour. In this paper, we show that common knowledge between agents allows for complex decentralised coordination. Common knowledge arises naturally in a large number of decentralised cooperative multi-agent tasks, for example, when agents can reconstruct parts of each other's observations. Since agents can independently agree on their common knowledge, they can execute complex coordinated policies that condition on this knowledge in a fully decentralised fashion. We propose multi-agent common knowledge reinforcement learning (MACKRL), a novel stochastic actor-critic algorithm that learns a hierarchical policy tree. Higher levels in the hierarchy coordinate groups of agents by conditioning on their common knowledge, or delegate to lower levels with smaller subgroups but potentially richer common knowledge. The entire policy tree can be executed in a fully decentralised fashion. As the lowest level of the policy tree consists of independent policies for each agent, MACKRL reduces to independently learnt decentralised policies as a special case. We demonstrate that our method can exploit common knowledge for superior performance on complex decentralised coordination tasks, including a stochastic matrix game and challenging problems in StarCraft II unit micromanagement.
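As an illustration of why conditioning on common knowledge permits coordination without communication, consider the following two-level sketch, a drastic simplification of MACKRL's policy-tree hierarchy in which all names and interfaces are hypothetical. Because every agent evaluates the same controller on the same common-knowledge input with a shared random seed, all agents sample identical branch choices, and hence identical joint actions, fully decentralised.

```python
import random

def mackrl_act(agent_id, common_obs, private_obs, seed,
               joint_policy, delegate_policies):
    """Illustrative decentralised execution of a two-level
    MACKRL-style policy tree (interfaces are assumptions).

    `joint_policy(common_obs, rng)` returns a dict mapping
    agent ids to actions, or None to delegate.
    `delegate_policies[agent_id]` is an independent per-agent
    policy over private observations.
    """
    # Common randomness: identical stream on every agent.
    rng = random.Random(seed)
    # The controller conditions ONLY on common knowledge, so
    # every agent computes the same `choice` locally.
    choice = joint_policy(common_obs, rng)
    if choice is not None:
        return choice[agent_id]      # coordinated joint action
    # Delegate: fall back to an independent decentralised policy.
    return delegate_policies[agent_id](private_obs, rng)
```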
Gossip protocols aim at arriving, by means of point-to-point or group communications, at a situation in which all the agents know each other's secrets. Recently, a number of authors have studied distributed epistemic gossip protocols. These protocols use as guards formulas from a simple epistemic logic, which makes their analysis and verification substantially easier. We study here common knowledge in the context of such a logic. First, we analyze when it can be reduced to iterated knowledge. Then we show that the semantics and truth of formulas without nested common knowledge operators are decidable. This implies that implementability, partial correctness and termination of distributed epistemic gossip protocols that use non-nested common knowledge operators are decidable, as well. Given that common knowledge is equivalent to an infinite conjunction of nested knowledge, these results are non-trivial generalizations of the corresponding decidability results for the original epistemic logic, established in (Apt & Wojtczak, 2016). K. R. Apt & D. Wojtczak (2016): On Decidability of a Logic of Gossips. In Proc. of JELIA 2016, pp. 18-33, doi:10.1007/978-3-319-48758-8_2.
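For intuition on why the infinite conjunction of nested knowledge can nonetheless be decided, recall that in a finite Kripke model, common knowledge of a formula holds at a world exactly when the formula holds at every world reachable through one or more steps of the union of the agents' accessibility relations, so checking it reduces to graph reachability. The sketch below does this under a generic model representation; note this is the standard Kripke semantics, not the gossip-specific semantics of the paper, and the interface is an assumption.

```python
from collections import deque

def common_knowledge(world, relations, holds):
    """Check C(phi) at `world` in a finite Kripke model.

    The infinite conjunction E(phi), E(E(phi)), ... collapses
    to: phi holds at every world reachable via >= 1 steps of
    the union of the agents' accessibility relations.

    `relations`: iterable of dicts, world -> set of worlds
                 (one accessibility relation per agent).
    `holds(w)`:  truth of phi at world w (assumed interface).
    """
    seen, frontier = set(), deque([world])
    while frontier:
        w = frontier.popleft()
        for rel in relations:
            for v in rel.get(w, ()):
                if v not in seen:
                    if not holds(v):
                        return False   # counterexample world
                    seen.add(v)
                    frontier.append(v)
    return True
```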
The task of identifying and reasoning with circumstantial preconditions associated with everyday facts is natural to humans. Despite their impressive accuracy on existing commonsense tasks, it is unclear whether state-of-the-art language models (LMs) understand the implicit preconditions that enable or invalidate commonsense facts, such as "A glass is used for drinking water." In this paper, we propose a new problem of reasoning with circumstantial preconditions, and present a dataset, called CoreQuisite, which annotates commonsense facts with preconditions expressed in natural language. Based on this resource, we create three canonical evaluation tasks and use them to examine the capability of existing LMs to understand situational preconditions. Our results show that there is a 10-30% gap between machine and human performance on our tasks. We make all resources and software publicly available.
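For concreteness, an annotated instance in the spirit described above might pair a fact with an enabling or invalidating precondition, along the following lines (the field names and label vocabulary are assumptions for illustration, not CoreQuisite's actual schema):

```python
# Hypothetical precondition-annotated commonsense fact.
example = {
    "fact": "A glass is used for drinking water.",
    "precondition": "the glass is not shattered",
    "label": "enabling",  # an invalidating precondition might be
                          # "the glass is full of sand"
}
```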
This preprint is the extended version of a paper that will be published in the proceedings of the Oberwolfach conference "Explicit vs. tacit knowledge in mathematics" (January 2012). It presents a case study of some algebraic research at the turn of the twentieth century that involved mainly French and American authors. By investigating the collective dimensions of these works, this paper sheds light on the tension between the tacit and the explicit in the ways some groups of texts hold together, thereby constituting some shared algebraic cultures. Although prominent algebraists such as Dickson made extensive references to papers published in France, and despite the roles played by algebra and arithmetic in the development of the American mathematical community, our knowledge of the circulation of knowledge between France and the United States at the beginning of the 20th century is still very limited. It is my aim to tackle such issues through the case study of a specific collective approach to finite group theory at the turn of the 20th century. This specific approach can be understood as a shared algebraic culture based on the long-run circulation of some specific procedures for decomposing the analytic forms of substitutions. In this context, the general linear group was introduced as the maximal group in which an elementary abelian group (i.e., the additive group of a Galois field) is a normal subgroup.