No Arabic abstract
The literature on ranking from ordinal data is vast, and there are several ways to aggregate overall preferences from pairwise comparisons between objects. In particular, it is well known that any Nash equilibrium of the zero sum game induced by the preference matrix defines a natural solution concept (winning distribution over objects) known as a von Neumann winner. Many real-world problems, however, are inevitably multi-criteria, with different pairwise preferences governing the different criteria. In this work, we generalize the notion of a von Neumann winner to the multi-criteria setting by taking inspiration from Blackwells approachability. Our framework allows for non-linear aggregation of preferences across criteria, and generalizes the linearization-based approach from multi-objective optimization. From a theoretical standpoint, we show that the Blackwell winner of a multi-criteria problem instance can be computed as the solution to a convex optimization problem. Furthermore, given random samples of pairwise comparisons, we show that a simple plug-in estimator achieves near-optimal minimax sample complexity. Finally, we showcase the practical utility of our framework in a user study on autonomous driving, where we find that the Blackwell winner outperforms the von Neumann winner for the overall preferences.
Recent advances in the blockchain research have been made in two important directions. One is refined resilience analysis utilizing game theory to study the consequences of selfish behaviors of users (miners), and the other is the extension from a linear (chain) structure to a non-linear (graphical) structure for performance improvements, such as IOTA and Graphcoin. The first question that comes to peoples minds is what improvements that a blockchain system would see by leveraging these new advances. In this paper, we consider three major metrics for a blockchain system: full verification, scalability, and finality-duration. We { establish a formal framework and} prove that no blockchain system can achieve full verification, high scalability, and low finality-duration simultaneously. We observe that classical blockchain systems like Bitcoin achieves full verification and low finality-duration, Harmony and Ethereum 2.0 achieve low finality-duration and high scalability. As a complementary, we design a non-linear blockchain system that achieves full verification and scalability. We also establish, for the first time, the trade-off between scalability and finality-duration.
Research in adversarial learning follows a cat and mouse game between attackers and defenders where attacks are proposed, they are mitigated by new defenses, and subsequently new attacks are proposed that break earlier defenses, and so on. However, it has remained unclear as to whether there are conditions under which no better attacks or defenses can be proposed. In this paper, we propose a game-theoretic framework for studying attacks and defenses which exist in equilibrium. Under a locally linear decision boundary model for the underlying binary classifier, we prove that the Fast Gradient Method attack and the Randomized Smoothing defense form a Nash Equilibrium. We then show how this equilibrium defense can be approximated given finitely many samples from a data-generating distribution, and derive a generalization bound for the performance of our approximation.
Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data. However, designing stable and efficient MBRL algorithms using rich function approximators have remained challenging. To help expose the practical challenges in MBRL and simplify algorithm design from the lens of abstraction, we develop a new framework that casts MBRL as a game between: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player. For algorithm development, we construct a Stackelberg game between the two players, and show that it can be solved with approximate bi-level optimization. This gives rise to two natural families of algorithms for MBRL based on which player is chosen as the leader in the Stackelberg game. Together, they encapsulate, unify, and generalize many previous MBRL algorithms. Furthermore, our framework is consistent with and provides a clear basis for heuristics known to be important in practice from prior works. Finally, through experiments we validate that our proposed algorithms are highly sample efficient, match the asymptotic performance of model-free policy gradient, and scale gracefully to high-dimensional tasks like dexterous hand manipulation. Additional details and code can be obtained from the project page at https://sites.google.com/view/mbrl-game
Learning controllable and generalizable representation of multivariate data with desired structural properties remains a fundamental problem in machine learning. In this paper, we present a novel framework for learning generative models with various underlying structures in the latent space. We represent the inductive bias in the form of mask variables to model the dependency structure in the graphical model and extend the theory of multivariate information bottleneck to enforce it. Our model provides a principled approach to learn a set of semantically meaningful latent factors that reflect various types of desired structures like capturing correlation or encoding invariance, while also offering the flexibility to automatically estimate the dependency structure from data. We show that our framework unifies many existing generative models and can be applied to a variety of tasks including multi-modal data modeling, algorithmic fairness, and invariant risk minimization.
Energy game-theoretic frameworks have emerged to be a successful strategy to encourage energy efficient behavior in large scale by leveraging human-in-the-loop strategy. A number of such frameworks have been introduced over the years which formulate the energy saving process as a competitive game with appropriate incentives for energy efficient players. However, prior works involve an incentive design mechanism which is dependent on knowledge of utility functions for all the players in the game, which is hard to compute especially when the number of players is high, common in energy game-theoretic frameworks. Our research proposes that the utilities of players in such a framework can be grouped together to a relatively small number of clusters, and the clusters can then be targeted with tailored incentives. The key to above segmentation analysis is to learn the features leading to human decision making towards energy usage in competitive environments. We propose a novel graphical lasso based approach to perform such segmentation, by studying the feature correlations in a real-world energy social game dataset. To further improve the explainability of the model, we perform causality study using grangers causality. Proposed segmentation analysis results in characteristic clusters demonstrating different energy usage behaviors. We also present avenues to implement intelligent incentive design using proposed segmentation method.