
Exploration via design and the cost of uncertainty in keyword auctions

Published by Sudhir Kumar Singh
Publication date: 2007
Research field: Informatics Engineering
Research language: English





We present a deterministic exploration mechanism for sponsored search auctions that enables the auctioneer to learn the relevance scores of advertisers and allows advertisers to estimate the true value of clicks generated at the auction site. This exploratory mechanism deviates only minimally from the mechanism currently used by Google and Yahoo!, in the sense that it retains the same pricing rule, a similar ranking scheme, and a similar mathematical structure of payoffs. In particular, the relevance scores and true values are estimated by giving lower-ranked advertisers a chance to obtain better slots. This allows the search engine to potentially test a new pool of advertisers and, correspondingly, enables new advertisers to estimate the value of clicks/leads generated via the auction. Both these quantities are unknown a priori, and their knowledge is necessary for the auction to operate efficiently. We show that such an exploration policy can be incorporated without any significant loss in revenue for the auctioneer. We compare the revenue of the new mechanism to that of the standard mechanism at their corresponding symmetric Nash equilibria (SNE) and compute the cost of uncertainty, which is defined as the relative loss in expected revenue per impression. We also bound the loss in efficiency, as well as in user experience, due to exploration under the same solution concept (i.e., SNE). Thus the proposed exploration mechanism learns the relevance scores while incorporating the incentive constraints of the advertisers, who are selfish and try to maximize their own profits; the exploration is therefore essentially achieved via mechanism design. We also discuss variations of the new mechanism, such as truthful implementations.
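
The verbal definition of the cost of uncertainty above can be written as a ratio. The symbols here are assumptions made for illustration, not notation taken from the paper: R_std is the expected revenue per impression of the standard mechanism at its SNE, and R_exp is the expected revenue per impression of the exploration mechanism at its SNE.

\[
\text{cost of uncertainty} \;=\; \frac{R_{\mathrm{std}} - R_{\mathrm{exp}}}{R_{\mathrm{std}}}
\]

Under this reading, a value close to zero corresponds to the abstract's claim that exploration can be incorporated without significant loss in revenue.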




Read also

Search auctions have become a dominant source of revenue generation on the Internet. Such auctions have typically used per-click bidding and pricing. We propose the use of hybrid auctions where an advertiser can make a per-impression as well as a per-click bid, and the auctioneer then chooses one of the two as the pricing mechanism. We assume that the advertiser and the auctioneer both have separate beliefs (called priors) on the click-probability of an advertisement. We first prove that the hybrid auction is truthful, assuming that the advertisers are risk-neutral. We then show that this auction is superior to the existing per-click auction in multiple ways: 1) It takes into account the risk characteristics of the advertisers. 2) For obscure keywords, the auctioneer is unlikely to have a very sharp prior on the click-probabilities. In such situations, the hybrid auction can result in significantly higher revenue. 3) An advertiser who believes that its click-probability is much higher than the auctioneer's estimate can use per-impression bids to correct the auctioneer's prior without incurring any extra cost. 4) The hybrid auction can allow the advertiser and auctioneer to implement complex dynamic programming strategies. As Internet commerce matures, we need more sophisticated pricing models to exploit all the information held by each of the participants. We believe that hybrid auctions could be an important step in this direction.
The problem of exploration in unknown environments continues to pose a challenge for reinforcement learning algorithms, as interactions with the environment are usually expensive or limited. The technique of setting subgoals with an intrinsic reward allows for the use of supplemental feedback to aid an agent in environments with sparse and delayed rewards. In fact, it can be an effective tool in directing the exploration behavior of the agent toward useful parts of the state space. In this paper, we consider problems where an agent faces an unknown task in the future and is given prior opportunities to "practice" on related tasks where the interactions are still expensive. We propose a one-step Bayes-optimal algorithm for selecting subgoal designs, along with the number of episodes and the episode length, to efficiently maximize the expected performance of an agent. We demonstrate its excellent performance on a variety of tasks and also prove an asymptotic optimality guarantee.
Benjamin Heymann, 2018
A standard result from auction theory is that bidding truthfully in a second price auction is a weakly dominant strategy. The result, however, does not apply in the presence of Cost Per Action (CPA) constraints. Such constraints exist, for instance, in digital advertising, as some buyers may try to maximize the total number of clicks while keeping the empirical Cost Per Click (CPC) below a threshold. More generally, the CPA constraint implies that the buyer has a maximal average cost per unit of value in mind. We discuss how such constraints change some traditional results from auction theory. Following the usual textbook narrative on auction theory, we focus specifically on the symmetric setting. We formalize the notion of CPA-constrained auctions and derive a Nash equilibrium for second price auctions. We then extend this result to combinations of first and second price auctions. Further, we expose a revenue equivalence property and show that the seller's revenue-maximizing reserve price is zero. In practice, CPA-constrained buyers may target an empirical CPA on a given time horizon, as the auction is repeated many times. Thus the buyer's bidding behavior depends on past realizations. We show that the resulting buyer dynamic optimization problem can be formalized with stochastic control tools and solved numerically with available solvers.
Zheng Wen, Eric Bax, James Li, 2015
In quasi-proportional auctions, each bidder receives a fraction of the allocation equal to the weight of their bid divided by the sum of weights of all bids, where each bid's weight is determined by a weight function. We study the relationship between the weight function, bidders' private values, number of bidders, and the seller's revenue in equilibrium. It has been shown that if one bidder has a much higher private value than the others, then a nearly flat weight function maximizes revenue. Essentially, threatening the bidder who has the highest valuation with having to share the allocation maximizes the revenue. We show that as bidder private values approach parity, steeper weight functions maximize revenue by making the quasi-proportional auction more like a winner-take-all auction. We also show that steeper weight functions maximize revenue as the number of bidders increases. For flatter weight functions, there is known to be a unique pure-strategy Nash equilibrium. We show that a pure-strategy Nash equilibrium also exists for steeper weight functions, and we give lower bounds for bids at an equilibrium. For a special case that includes the two-bidder auction, we show that the pure-strategy Nash equilibrium is unique, and we show how to compute the revenue at equilibrium. We also show that selecting a weight function based on private value ratios and number of bidders is necessary for a quasi-proportional auction to produce more revenue than a second-price auction.
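
The allocation rule described in this abstract (each bidder's share equals the weight of their bid divided by the sum of all weights) can be illustrated with a minimal sketch. The power-law weight family bid ** exponent and the helper names power_weight and quasi_proportional_shares are assumptions chosen for illustration; the abstract does not specify which weight functions the authors study. A small exponent gives a flatter weight function, and a large exponent a steeper one that behaves more like winner-take-all.

```python
from typing import Callable, List, Sequence


def power_weight(exponent: float) -> Callable[[float], float]:
    """Hypothetical weight family: weight(bid) = bid ** exponent."""
    def weight(bid: float) -> float:
        return bid ** exponent
    return weight


def quasi_proportional_shares(
    bids: Sequence[float], weight: Callable[[float], float]
) -> List[float]:
    """Share of bidder i = weight(bid_i) / sum of weight(bid_j) over all j."""
    weights = [weight(b) for b in bids]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one bid must have positive weight")
    return [w / total for w in weights]


bids = [4.0, 2.0, 2.0]
flat_shares = quasi_proportional_shares(bids, power_weight(0.5))    # flatter
linear_shares = quasi_proportional_shares(bids, power_weight(1.0))  # proportional
steep_shares = quasi_proportional_shares(bids, power_weight(4.0))   # steeper

for label, shares in [("flat", flat_shares),
                      ("linear", linear_shares),
                      ("steep", steep_shares)]:
    print(label, [round(s, 3) for s in shares])
```

With these example bids, the highest bidder's share grows as the exponent increases, which matches the abstract's remark that steeper weight functions make the auction more like winner-take-all. The sketch only computes allocations for fixed bids; it does not model equilibrium bidding or revenue.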
In this study, we apply reinforcement learning techniques and propose what we call reinforcement mechanism design to tackle the dynamic pricing problem in sponsored search auctions. In contrast to previous game-theoretical approaches that heavily rely on rationality and common knowledge among the bidders, we take a data-driven approach and try to learn, over repeated interactions, the set of optimal reserve prices. We implement our approach within the current sponsored search framework of a major search engine: we first train a buyer behavior model, via a real bidding data set, that accurately predicts bids given information that bidders are aware of, including the game parameters disclosed by the search engine, as well as the bidders' KPI data from previous rounds. We then put forward a reinforcement/MDP (Markov Decision Process) based algorithm that optimizes reserve prices over time in a GSP-like auction. Our simulations demonstrate that our framework outperforms static optimization strategies, including the ones that are currently in use, as well as several other dynamic ones.