أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Paul Muller

Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

185 - Luke Marris , Paul Muller , Marc Lanctot 2021

Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensiv e form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions for solving the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.

أنظمة متعددة العملاء الذكاء الاصطناعي علوم الكمبيوتر ونظرية الألعاب

Game Plan: What AI can do for Football, and What Football can do for AI

409 - Karl Tuyls , Shayegan Omidshafiei , Paul Muller 2020

The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both individual players and coordinated teams behaviors. The research challenges associated with predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. In this paper, we provide an overarching perspective highlighting how the combination of these fields, in particular, forms a unique microcosm for AI research, while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come. We illustrate that this duality makes football analytics a game changer of tremendous value, in terms of not only changing the game of football itself, but also in terms of what this domain can mean for the field of AI. We review the state-of-the-art and exemplify the types of analysis enabled by combining the aforementioned fields, including illustrative examples of counterfactual analysis using predictive models, and the combination of game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude by highlighting envisioned downstream impacts, including possibilities for extensions to other sports (real and virtual).

الذكاء الاصطناعي علوم الكمبيوتر ونظرية الألعاب أنظمة متعددة العملاء

A Generalized Training Approach for Multiagent Learning

135 - Paul Muller , Shayegan Omidshafiei , Mark Rowland 2019

This paper investigates a population-based training regime based on game-theoretic principles called Policy-Spaced Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have been focused on two-player zero-sum games, a regime wherein Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings, computation of Nash equilibria quickly becomes infeasible. Here, we extend the theoretical underpinnings of PSRO by considering an alternative solution concept, $alpha$-Rank, which is unique (thus faces no equilibrium selection issues, unlike Nash) and applies readily to general-sum, many-player settings. We establish convergence guarantees in several games classes, and identify links between Nash equilibria and $alpha$-Rank. We demonstrate the competitive performance of $alpha$-Rank-based PSRO against an exact Nash solver-based PSRO in 2-player Kuhn and Leduc Poker. We then go beyond the reach of prior PSRO applications by considering 3- to 5-player poker games, yielding instances where $alpha$-Rank achieves faster convergence than approximate Nash solvers, thus establishing it as a favorable general games solver. We also carry out an initial empirical validation in MuJoCo soccer, illustrating the feasibility of the proposed approach in another complex domain.

أنظمة متعددة العملاء الذكاء الاصطناعي التعلم الآلي

An Accelerated Analog Neuromorphic Hardware System Emulating NMDA- and Calcium-Based Non-Linear Dendrites

217 - Johannes Schemmel , Laura Kriener , Paul Muller 2017

This paper presents an extension of the BrainScaleS accelerated analog neuromorphic hardware model. The scalable neuromorphic architecture is extended by the support for multi-compartment models and non-linear dendrites. These features are part of a SI{65}{ anometer} prototype ASIC. It allows to emulate different spike types observed in cortical pyramidal neurons: NMDA plateau potentials, calcium and sodium spikes. By replicating some of the structures of these cells, they can be configured to perform coincidence detection within a single neuron. Built-in plasticity mechanisms can modify not only the synaptic weights, but also the dendritic synaptic composition to efficiently train large multi-compartment neurons. Transistor-level simulations demonstrate the functionality of the analog implementation and illustrate analogies to biological measurements.

الحوسبة العصبية والتطورية التقنيات الناشئة

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد