No Arabic abstract
Prediction is a well-studied machine learning task, and prediction algorithms are core ingredients in online products and services. Despite their centrality in the competition between online companies who offer prediction-based products, the strategic use of prediction algorithms remains unexplored. The goal of this paper is to examine strategic use of prediction algorithms. We introduce a novel game-theoretic setting that is based on the PAC learning framework, where each player (aka a prediction algorithm at competition) seeks to maximize the sum of points for which it produces an accurate prediction and the others do not. We show that algorithms aiming at generalization may wittingly miss-predict some points to perform better than others on expectation. We analyze the empirical game, i.e. the game induced on a given sample, prove that it always possesses a pure Nash equilibrium, and show that every better-response learning process converges. Moreover, our learning-theoretic analysis suggests that players can, with high probability, learn an approximate pure Nash equilibrium for the whole population using a small number of samples.
We present a way to analyze the distribution produced by a Monte Carlo algorithm. We perform these analyses on sever
Most modern systems strive to learn from interactions with users, and many engage in exploration: making potentially suboptimal choices for the sake of acquiring new information. We initiate a study of the interplay between exploration and competition--how such systems balance the exploration for learning and the competition for users. Here the users play three distinct roles: they are customers that generate revenue, they are sources of data for learning, and they are self-interested agents which choose among the competing systems. In our model, we consider competition between two multi-armed bandit algorithms faced with the same bandit instance. Users arrive one by one and choose among the two algorithms, so that each algorithm makes progress if and only if it is chosen. We ask whether and to what extent competition incentivizes the adoption of better bandit algorithms. We investigate this issue for several models of user response, as we vary the degree of rationality and competitiveness in the model. Our findings are closely related to the competition vs. innovation relationship, a well-studied theme in economics.
We introduce a model of competing agents in a prophet setting, where rewards arrive online, and decisions are made immediately and irrevocably. The rewards are unknown from the outset, but they are drawn from a known probability distribution. In the standard prophet setting, a single agent makes selection decisions in an attempt to maximize her expected reward. The novelty of our model is the introduction of a competition setting, where multiple agents compete over the arriving rewards, and make online selection decisions simultaneously, as rewards arrive. If a given reward is selected by more than a single agent, ties are broken either randomly or by a fixed ranking of the agents. The consideration of competition turns the prophet setting from an online decision making scenario to a multi-agent game. For both random and ranked tie-breaking rules, we present simple threshold strategies for the agents that give them high guarantees, independent of the strategies taken by others. In particular, for random tie-breaking, every agent can guarantee herself at least $frac{1}{k+1}$ of the highest reward, and at least $frac{1}{2k}$ of the optimal social welfare. For ranked tie-breaking, the $i$th ranked agent can guarantee herself at least a half of the $i$th highest reward. We complement these results by matching upper bounds, even with respect to equilibrium profiles. For ranked tie-breaking rule, we also show a correspondence between the equilibrium of the $k$-agent game and the optimal strategy of a single decision maker who can select up to $k$ rewards.
The Bitcoin protocol prescribes certain behavior by the miners who are responsible for maintaining and extending the underlying blockchain; in particular, miners who successfully solve a puzzle, and hence can extend the chain by a block, are supposed to release that block immediately. Eyal and Sirer showed, however, that a selfish miner is incentivized to deviate from the protocol and withhold its blocks under certain conditions. The analysis by Eyal and Sirer, as well as in followup work, considers a emph{single} deviating miner (who may control a large fraction of the hashing power in the network) interacting with a remaining pool of honest miners. Here, we extend this analysis to the case where there are emph{multiple} (non-colluding) selfish miners. We find that with multiple strategic miners, specific deviations from honest mining by multiple strategic agents can outperform honest mining, even if individually miners would not be incentivised to be dishonest. This previous point effectively renders the Bitcoin protocol to be less secure than previously thought.
Most online platforms strive to learn from interactions with users, and many engage in exploration: making potentially suboptimal choices for the sake of acquiring new information. We study the interplay between exploration and competition: how such platforms balance the exploration for learning and the competition for users. Here users play three distinct roles: they are customers that generate revenue, they are sources of data for learning, and they are self-interested agents which choose among the competing platforms. We consider a stylized duopoly model in which two firms face the same multi-armed bandit problem. Users arrive one by one and choose between the two firms, so that each firm makes progress on its bandit problem only if it is chosen. Through a mix of theoretical results and numerical simulations, we study whether and to what extent competition incentivizes the adoption of better bandit algorithms, and whether it leads to welfare increases for users. We find that stark competition induces firms to commit to a greedy bandit algorithm that leads to low welfare. However, weakening competition by providing firms with some free users incentivizes better exploration strategies and increases welfare. We investigate two channels for weakening the competition: relaxing the rationality of users and giving one firm a first-mover advantage. Our findings are closely related to the competition vs. innovation relationship, and elucidate the first-mover advantage in the digital economy.