No Arabic abstract
In practice, most mechanisms for selling, buying, matching, voting, and so on are not incentive compatible. We present techniques for estimating how far a mechanism is from incentive compatible. Given samples from the agents type distribution, we show how to estimate the extent to which an agent can improve his utility by misreporting his type. We do so by first measuring the maximum utility an agent can gain by misreporting his type on average over the samples, assuming his true and reported types are from a finite subset---which our technique constructs---of the type space. The challenge is that by measuring utility gains over a finite subset of the type space, we might miss type pairs $theta$ and $hat{theta}$ where an agent with type $theta$ can greatly improve his utility by reporting type $hat{theta}$. Indeed, our primary technical contribution is proving that the maximum utility gain over this finite subset nearly matches the maximum utility gain overall, despite the volatility of the utility functions we study. We apply our tools to the single-item and combinatorial first-price auctions, generalized second-price auction, discriminatory auction, uniform-price auction, and second-price auction with spiteful bidders.
In this paper we investigate the problem of measuring end-to-end Incentive Compatibility (IC) regret given black-box access to an auction mechanism. Our goal is to 1) compute an estimate for IC regret in an auction, 2) provide a measure of certainty around the estimate of IC regret, and 3) minimize the time it takes to arrive at an accurate estimate. We consider two main problems, with different informational assumptions: In the emph{advertiser problem} the goal is to measure IC regret for some known valuation $v$, while in the more general emph{demand-side platform (DSP) problem} we wish to determine the worst-case IC regret over all possible valuations. The problems are naturally phrased in an online learning model and we design $Regret-UCB$ algorithms for both problems. We give an online learning algorithm where for the advertiser problem the error of determining IC shrinks as $OBig(frac{|B|}{T}cdotBig(frac{ln T}{n} + sqrt{frac{ln T}{n}}Big)Big)$ (where $B$ is the finite set of bids, $T$ is the number of time steps, and $n$ is number of auctions per time step), and for the DSP problem it shrinks as $OBig(frac{|B|}{T}cdotBig( frac{|B|ln T}{n} + sqrt{frac{|B|ln T}{n}}Big)Big)$. For the DSP problem, we also consider stronger IC regret estimation and extend our $Regret-UCB$ algorithm to achieve better IC regret error. We validate the theoretical results using simulations with Generalized Second Price (GSP) auctions, which are known to not be incentive compatible and thus have strictly positive IC regret.
Individual decision-makers consume information revealed by the previous decision makers, and produce information that may help in future decisions. This phenomenon is common in a wide range of scenarios in the Internet economy, as well as in other domains such as medical decisions. Each decision-maker would individually prefer to exploit: select an action with the highest expected reward given her current information. At the same time, each decision-maker would prefer previous decision-makers to explore, producing information about the rewards of various actions. A social planner, by means of carefully designed information disclosure, can incentivize the agents to balance the exploration and exploitation so as to maximize social welfare. We formulate this problem as a multi-armed bandit problem (and various generalizations thereof) under incentive-compatibility constraints induced by the agents Bayesian priors. We design an incentive-compatible bandit algorithm for the social planner whose regret is asymptotically optimal among all bandit algorithms (incentive-compatible or not). Further, we provide a black-box reduction from an arbitrary multi-arm bandit algorithm to an incentive-compatible one, with only a constant multiplicative increase in regret. This reduction works for very general bandit setting that incorporate contexts and arbitrary auxiliary feedback.
Motivated by applications such as college admission and insurance rate determination, we propose an evaluation problem where the inputs are controlled by strategic individuals who can modify their features at a cost. A learner can only partially observe the features, and aims to classify individuals with respect to a quality score. The goal is to design an evaluation mechanism that maximizes the overall quality score, i.e., welfare, in the population, taking any strategic updating into account. We further study the algorithmic aspect of finding the welfare maximizing evaluation mechanism under two specific settings in our model. When scores are linear and mechanisms use linear scoring rules on the observable features, we show that the optimal evaluation mechanism is an appropriate projection of the quality score. When mechanisms must use linear thresholds, we design a polynomial time algorithm with a (1/4)-approximation guarantee when the underlying feature distribution is sufficiently smooth and admits an oracle for finding dense regions. We extend our results to settings where the prior distribution is unknown and must be learned from samples.
Selecting the most influential agent in a network has huge practical value in applications. However, in many scenarios, the graph structure can only be known from agents reports on their connections. In a self-interested setting, agents may strategically hide some connections to make themselves seem to be more important. In this paper, we study the incentive compatible (IC) selection mechanism to prevent such manipulations. Specifically, we model the progeny of an agent as her influence power, i.e., the number of nodes in the subgraph rooted at her. We then propose the Geometric Mechanism, which selects an agent with at least 1/2 of the optimal progeny in expectation under the properties of incentive compatibility and fairness. Fairness requires that two roots with the same contribution in two graphs are assigned the same probability. Furthermore, we prove an upper bound of 1/(1+ln 2) for any incentive compatible and fair selection mechanisms.
Miners in a blockchain system are suffering from ever-increasing storage costs, which in general have not been properly compensated by the users transaction fees. This reduces the incentives for the miners participation and may jeopardize the blockchain security. We propose to mitigate this blockchain insufficient fee issue through a Fee and Waiting Tax (FWT) mechanism, which explicitly considers the two types of negative externalities in the system. Specifically, we model the interactions between the protocol designer, users, and miners as a three-stage Stackelberg game. By characterizing the equilibrium of the game, we find that miners neglecting the negative externality in transaction selection cause they are willing to accept insufficient-fee transactions. This leads to the insufficient storage fee issue in the existing protocol. Moreover, our proposed optimal FWT mechanism can motivate users to pay sufficient transaction fees to cover the storage costs and achieve the unconstrained social optimum. Numerical results show that the optimal FWT mechanism guarantees sufficient transaction fees and achieves an average social welfare improvement of 33.73% or more over the existing protocol. Furthermore, the optimal FWT mechanism achieves the maximum fairness index and performs well even under heterogeneous-storage-cost miners.