Online Learning for Measuring Incentive Compatibility in Ad Auctions

Published by Zhe Feng

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Zhe Feng, Okke Schrijvers, Eric Sodomka

Computer Science and Game Theory; Machine Learning

In this paper we investigate the problem of measuring end-to-end Incentive Compatibility (IC) regret given black-box access to an auction mechanism. Our goal is to 1) compute an estimate for IC regret in an auction, 2) provide a measure of certainty around the estimate of IC regret, and 3) minimize the time it takes to arrive at an accurate estimate. We consider two main problems, with different informational assumptions: in the advertiser problem the goal is to measure IC regret for some known valuation $v$, while in the more general demand-side platform (DSP) problem we wish to determine the worst-case IC regret over all possible valuations. The problems are naturally phrased in an online learning model and we design Regret-UCB algorithms for both problems. We give an online learning algorithm where for the advertiser problem the error of determining IC shrinks as $O\Big(\frac{|B|}{T}\cdot\Big(\frac{\ln T}{n} + \sqrt{\frac{\ln T}{n}}\Big)\Big)$ (where $B$ is the finite set of bids, $T$ is the number of time steps, and $n$ is the number of auctions per time step), and for the DSP problem it shrinks as $O\Big(\frac{|B|}{T}\cdot\Big(\frac{|B|\ln T}{n} + \sqrt{\frac{|B|\ln T}{n}}\Big)\Big)$. For the DSP problem, we also consider stronger IC regret estimation and extend our Regret-UCB algorithm to achieve better IC regret error. We validate the theoretical results using simulations with Generalized Second Price (GSP) auctions, which are known to not be incentive compatible and thus have strictly positive IC regret.
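To make the quantity being measured concrete, the following is a minimal sketch of a brute-force empirical estimate of IC regret for the advertiser problem: the best average utility over a finite bid set minus the average utility of bidding truthfully. It is not the Regret-UCB algorithm from the paper, which chooses bids adaptively over time. The function names, the tie-breaking rule, the rival-bid lists, and the click-through rates are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def gsp_utility(value: float, bid: float,
                rival_bids: Sequence[float], ctrs: Sequence[float]) -> float:
    """Utility of one advertiser in a single GSP auction.

    Assumptions (illustrative, not from the paper): slots are ranked by bid
    only, ties are broken against this advertiser, and the per-click price
    is the next-highest bid (0 if there is none).
    """
    ranked = sorted(rival_bids, reverse=True)
    # Number of rivals ranked above us determines our slot index.
    position = sum(1 for r in ranked if r >= bid)
    if position >= len(ctrs):
        return 0.0  # no slot won
    price = ranked[position] if position < len(ranked) else 0.0
    return ctrs[position] * (value - price)


def empirical_ic_regret(value: float, bid_grid: Sequence[float],
                        auctions: Sequence[Sequence[float]],
                        ctrs: Sequence[float]) -> Tuple[float, float]:
    """Return (estimated IC regret, best bid in bid_grid) for a known value.

    The estimate is max_b avg_utility(b) - avg_utility(value), averaged over
    the given auctions. It is non-negative when value is in bid_grid.
    """
    if not auctions:
        raise ValueError("need at least one auction")

    def avg_utility(bid: float) -> float:
        total = sum(gsp_utility(value, bid, rivals, ctrs) for rivals in auctions)
        return total / len(auctions)

    best_bid = max(bid_grid, key=avg_utility)
    return avg_utility(best_bid) - avg_utility(value), best_bid


if __name__ == "__main__":
    # Hypothetical data: two slots, one observed auction with two rivals.
    rival_data: List[List[float]] = [[9.0, 1.0]]
    regret, best = empirical_ic_regret(10.0, [0.5, 5.0, 10.0], rival_data, [1.0, 0.9])
    print(f"estimated IC regret: {regret:.2f} at bid {best}")
```

In this toy setup shading the bid below the top rival wins the second slot at a much lower price, so the estimated regret is strictly positive, in line with the abstract's remark that GSP is not incentive compatible.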

Maria-Florina Balcan, Tuomas Sandholm, 2019

In practice, most mechanisms for selling, buying, matching, voting, and so on are not incentive compatible. We present techniques for estimating how far a mechanism is from incentive compatible. Given samples from the agents' type distribution, we show how to estimate the extent to which an agent can improve his utility by misreporting his type. We do so by first measuring the maximum utility an agent can gain by misreporting his type on average over the samples, assuming his true and reported types are from a finite subset---which our technique constructs---of the type space. The challenge is that by measuring utility gains over a finite subset of the type space, we might miss type pairs $\theta$ and $\hat{\theta}$ where an agent with type $\theta$ can greatly improve his utility by reporting type $\hat{\theta}$. Indeed, our primary technical contribution is proving that the maximum utility gain over this finite subset nearly matches the maximum utility gain overall, despite the volatility of the utility functions we study. We apply our tools to the single-item and combinatorial first-price auctions, generalized second-price auction, discriminatory auction, uniform-price auction, and second-price auction with spiteful bidders.
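The first step described above, measuring the largest average gain from misreporting over a finite set of type pairs, can be sketched for a single-item first-price auction as follows. This is only an illustration under stated assumptions: the type grid is supplied by hand (the paper constructs its finite subset with its own technique), there is a single rival whose bids are given directly as samples, and the function names are hypothetical.

```python
from typing import Sequence


def first_price_utility(true_type: float, reported_type: float,
                        rival_bid: float) -> float:
    """Single-item first-price auction: win iff the report beats the rival
    bid, and pay the reported amount. Ties go to the rival (an assumption)."""
    return true_type - reported_type if reported_type > rival_bid else 0.0


def max_gain_on_grid(type_grid: Sequence[float],
                     rival_samples: Sequence[float]) -> float:
    """Largest average utility gain from misreporting, taken over all
    (true type, reported type) pairs in type_grid and averaged over samples."""
    if not rival_samples:
        raise ValueError("need at least one sample")

    def avg(true_type: float, report: float) -> float:
        total = sum(first_price_utility(true_type, report, r) for r in rival_samples)
        return total / len(rival_samples)

    return max(avg(t, r) - avg(t, t) for t in type_grid for r in type_grid)


if __name__ == "__main__":
    # Hypothetical grid and rival-bid samples.
    print(max_gain_on_grid([0.0, 0.5, 1.0], [0.2, 0.4]))
```

A coarse grid like this one can miss the most profitable misreport, which is exactly the gap the abstract says its main technical result closes.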

Computer Science and Game Theory

Targeting and Signaling in Ad Auctions

Ashwinkumar Badanidiyuru, Kshipra Bhawalkar, Haifeng Xu, 2017

Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Unlike previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is independent of the number of ad opportunity realizations. We examine the auctioneer's algorithmic question of designing the optimal signaling scheme. When the auctioneer is restricted to sending a public signal to all bidders, we focus on a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two main results: (1) we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals; (2) we present a simple public signaling scheme that serves as a constant approximation under mild assumptions. We then initiate an exploration of the power of being able to send different signals privately to different bidders. Here we examine a basic setting where the auctioneer knows bidders' valuations, and exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes-Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.

Computer Science and Game Theory

Equilibria in Auctions With Ad Types

Hadi Elzayn, Riccardo Colini-Baldeschi, Brian Lan, 2021

This paper studies equilibrium quality of semi-separable position auctions (known as the Ad Types setting) with greedy or optimal allocation combined with generalized second-price (GSP) or Vickrey-Clarke-Groves (VCG) pricing. We make three contributions: first, we give upper and lower bounds on the Price of Anarchy (PoA) for auctions which use greedy allocation with GSP pricing, greedy allocation with VCG pricing, and optimal allocation with GSP pricing. Second, we give Bayes-Nash equilibrium characterizations for two-player, two-slot instances (for all auction formats) and show that there exists both a revenue hierarchy and revenue equivalence across some formats. Finally, we use no-regret learning algorithms and bidding data from a large online advertising platform to evaluate the performance of the mechanisms under semi-realistic conditions. For welfare, we find that the optimal-to-realized welfare ratio (an empirical PoA analogue) is broadly better than our upper bounds on PoA; for revenue, we find that the hierarchy in practice may sometimes agree with simple theory, but generally appears sensitive to the underlying distribution of bidder valuations.

Computer Science and Game Theory

Behavior-Based Online Incentive Mechanism for Crowd Sensing with Budget Constraints

Jiajun Sun, 2013

Crowd sensing is a new paradigm which leverages the ubiquity of sensor-equipped mobile devices to collect data. To achieve good quality for crowd sensing, incentive mechanisms are indispensable to attract more participants. Most existing mechanisms focus on the expected utility prior to sensing, ignoring the risk of low-quality solutions and privacy leakage. Traditional incentive mechanisms such as the Vickrey-Clarke-Groves (VCG) mechanism and its variants are not applicable here. In this paper, to address these challenges, we propose a behavior-based incentive mechanism for crowd sensing applications with budget constraints by applying sequential all-pay auctions in mobile social networks (MSNs). The mechanism accounts for the effects of extensive user participation and also maximizes the quality of the context-based sensing content submitted to the crowd sensing platform under the budget constraints, where users arrive in a sequential order. Extensive simulation results indicate that the incentive mechanisms in our proposed framework outperform the best existing solution.

Computer Science and Game Theory; Networking and Internet Architecture

Incentive Mechanism Design for Distributed Coded Machine Learning

Ningning Ding, Zhixuan Fang, Lingjie Duan, 2020

A distributed machine learning platform needs to recruit many heterogeneous worker nodes to finish computation simultaneously. As a result, the overall performance may be degraded due to straggling workers. By introducing redundancy into computation, coded machine learning can effectively improve the runtime performance by recovering the final computation result through the first $k$ (out of the total $n$) workers who finish computation. While existing studies focus on designing efficient coding schemes, the issue of designing proper incentives to encourage worker participation is still under-explored. This paper studies the platform's optimal incentive mechanism for motivating proper workers' participation in coded machine learning, despite the incomplete information about heterogeneous workers' computation performance and costs. A key contribution of this work is to summarize workers' multi-dimensional heterogeneity as a one-dimensional metric, which guides the platform's efficient selection of workers under incomplete information with linear computational complexity. Moreover, we prove that the optimal recovery threshold $k$ is linearly proportional to the number of participants $n$ if we use the widely adopted MDS (Maximum Distance Separable) codes for data encoding. We also show that the platform's increased cost due to incomplete information disappears when the number of workers is sufficiently large, but it does not decrease monotonically in the number of workers.

Computer Science and Game Theory