Reinforcement Mechanism Design, with Applications to Dynamic Pricing in Sponsored Search Auctions

291 0 0.0 ( 0 )

Download Cite

Added by Weiran Shen

Publication date 2017

fields Informatics Engineering

and research's language is English

Authors Weiran Shen - Binghui Peng - Hanpeng Liu

Computer Science and Game Theory

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In this study, we apply reinforcement learning techniques and propose what we call reinforcement mechanism design to tackle the dynamic pricing problem in sponsored search auctions. In contrast to previous game-theoretical approaches that heavily rely on rationality and common knowledge among the bidders, we take a data-driven approach, and try to learn, over repeated interactions, the set of optimal reserve prices. We implement our approach within the current sponsored search framework of a major search engine: we first train a buyer behavior model, via a real bidding data set, that accurately predicts bids given information that bidders are aware of, including the game parameters disclosed by the search engine, as well as the bidders KPI data from previous rounds. We then put forward a reinforcement/MDP (Markov Decision Process) based algorithm that optimizes reserve prices over time, in a GSP-like auction. Our simulations demonstrate that our framework outperforms static optimization strategies including the ones that are currently in use, as well as several other dynamic ones.

rate research

Learning Fair Equilibria in Sponsored Search Auctions

168 - Andrea Celli , Riccardo Colini-Baldeschi , Stefano Leonardi 2021

In this work we investigate the strategic learning implications of the deployment of sponsored search auction mechanisms that obey to fairness criteria. We introduce a new class of mechanisms composing a traditional Generalized Second Price auction (GSP) with different fair division schemes to achieve some desired level of fairness between two groups of Bayesian strategic advertisers. We propose two mechanisms, $beta$-Fair GSP and GSP-EFX, that compose GSP with, respectively, an envy-free up to one item (EF1), and an envy-free up to any item (EFX) fair division scheme. The payments of GSP are adjusted in order to compensate the advertisers that suffer a loss of efficiency due the fair division stage. We prove that, for both mechanisms, if bidders play so as to minimize their external regret they are guaranteed to reach an equilibrium with good social welfare. We also prove that the mechanisms are budget balanced, so that the payments charged by the traditional GSP mechanism are a good proxy of the total compensation offered to the advertisers. Finally, we evaluate the quality of the allocations of the two mechanisms through experiments on real-world data.

Computer Science and Game Theory

Towards Better Models of Externalities in Sponsored Search Auctions

120 - Nicola Gatti , Marco Rocco , Paolo Serafino 2016

Sponsored Search Auctions (SSAs) arguably represent the problem at the intersection of computer science and economics with the deepest applications in real life. Within the realm of SSAs, the study of the effects that showing one ad has on the other ads, a.k.a. externalities in economics, is of utmost importance and has so far attracted the attention of much research. However, even the basic question of modeling the problem has so far escaped a definitive answer. The popular cascade model is arguably too idealized to really describe the phenomenon yet it allows a good comprehension of the problem. Other models, instead, describe the setting more adequately but are too complex to permit a satisfactory theoretical analysis. In this work, we attempt to get the best of both approaches: firstly, we define a number of general mathematical formulations for the problem in the attempt to have a rich description of externalities in SSAs and, secondly, prove a host of results drawing a nearly complete picture about the computational complexity of the problem. We complement these approximability results with some considerations about mechanism design in our context.

Computer Science and Game Theory

Revenue-Maximizing Mechanism Design for Quasi-Proportional Auctions

542 - Zheng Wen , Eric Bax , James Li 2015

In quasi-proportional auctions, each bidder receives a fraction of the allocation equal to the weight of their bid divided by the sum of weights of all bids, where each bids weight is determined by a weight function. We study the relationship between the weight function, bidders private values, number of bidders, and the sellers revenue in equilibrium. It has been shown that if one bidder has a much higher private value than the others, then a nearly flat weight function maximizes revenue. Essentially, threatening the bidder who has the highest valuation with having to share the allocation maximizes the revenue. We show that as bidder private values approach parity, steeper weight functions maximize revenue by making the quasi-proportional auction more like a winner-take-all auction. We also show that steeper weight functions maximize revenue as the number of bidders increases. For flatter weight functions, there is known to be a unique pure-strategy Nash equilibrium. We show that a pure-strategy Nash equilibrium also exists for steeper weight functions, and we give lower bounds for bids at an equilibrium. For a special case that includes the two-bidder auction, we show that the pure-strategy Nash equilibrium is unique, and we show how to compute the revenue at equilibrium. We also show that selecting a weight function based on private value ratios and number of bidders is necessary for a quasi-proportional auction to produce more revenue than a second-price auction.

Computer Science and Game Theory

Automated Dynamic Mechanism Design

89 - Hanrui Zhang , Vincent Conitzer 2021

We study Bayesian automated mechanism design in unstructured dynamic environments, where a principal repeatedly interacts with an agent, and takes actions based on the strategic agents report of the current state of the world. Both the principal and the agent can have arbitrary and potentially different valuations for the actions taken, possibly also depending on the actual state of the world. Moreover, at any time, the state of the world may evolve arbitrarily depending on the action taken by the principal. The goal is to compute an optimal mechanism which maximizes the principals utility in the face of the self-interested strategic agent. We give an efficient algorithm for computing optimal mechanisms, with or without payments, under different individual-rationality constraints, when the time horizon is constant. Our algorithm is based on a sophisticated linear program formulation, which can be customized in various ways to accommodate richer constraints. For environments with large time horizons, we show that the principals optimal utility is hard to approximate within a certain constant factor, complementing our algorithmic result. We further consider a special case of the problem where the agent is myopic, and give a refined efficient algorithm whose time complexity scales linearly in the time horizon. Moreover, we show that memoryless mechanisms do not provide a good solution for our problem, in terms of both optimality and computational tractability. These results paint a relatively complete picture for automated dynamic mechanism design in unstructured environments. Finally, we present experimental results where our algorithms are applied to synthetic dynamic environments with different characteristics, which not only serve as a proof of concept for our algorithms, but also exhibit intriguing phenomena in dynamic mechanism design.

Computer Science and Game Theory

Prior-free Dynamic Mechanism Design With Limited Liability

108 - Mark Braverman , Jon Schneider , S. Matthew Weinberg 2021

We study the problem of repeatedly auctioning off an item to one of $k$ bidders where: a) bidders have a per-round individual rationality constraint, b) bidders may leave the mechanism at any point, and c) the bidders valuations are adversarially chosen (the prior-free setting). Without these constraints, the auctioneer can run a second-price auction to sell the business and receive the second highest total value for the entire stream of items. We show that under these constraints, the auctioneer can attain a constant fraction of the sell the business benchmark, but no more than $2/e$ of this benchmark. In the course of doing so, we design mechanisms for a single bidder problem of independent interest: how should you repeatedly sell an item to a (per-round IR) buyer with adversarial valuations if you know their total value over all rounds is $V$ but not how their value changes over time? We demonstrate a mechanism that achieves revenue $V/e$ and show that this is tight.

Computer Science and Game Theory Theoretical Economics