In this study, we apply reinforcement learning techniques and propose what we call reinforcement mechanism design to tackle the dynamic pricing problem in sponsored search auctions. In contrast to previous game-theoretical approaches, which rely heavily on rationality and common knowledge among the bidders, we take a data-driven approach and try to learn the set of optimal reserve prices over repeated interactions. We implement our approach within the current sponsored search framework of a major search engine. We first train a buyer behavior model on a real bidding data set; the model accurately predicts bids given the information that bidders are aware of, including the game parameters disclosed by the search engine and the bidders' KPI data from previous rounds. We then put forward a reinforcement learning algorithm, based on a Markov Decision Process (MDP), that optimizes reserve prices over time in a GSP-like auction. Our simulations demonstrate that our framework outperforms static optimization strategies, including the ones currently in use, as well as several other dynamic ones.
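To make the role of the reserve price concrete, the following is a minimal sketch of a GSP-style auction with a single reserve, plus a grid search for the best static reserve. It is not the algorithm proposed in the abstract: ranking purely by bid (no quality scores), the function names `gsp_with_reserve` and `best_static_reserve`, and the example numbers are all assumptions made for illustration. Bids are held fixed here, whereas the abstract's point is that bidders respond to the reserve over time.

```python
from typing import List, Sequence, Tuple


def gsp_with_reserve(
    bids: Sequence[float], slot_ctrs: Sequence[float], reserve: float
) -> Tuple[List[Tuple[int, float]], float]:
    """Run one GSP-style auction (hypothetical simplification: rank by bid only).

    Returns the per-slot list of (bidder_index, price_per_click) and the
    expected revenue, i.e. the sum over slots of click rate times price.
    """
    # Only bids at or above the reserve are eligible; rank them high to low.
    eligible = sorted(
        ((b, i) for i, b in enumerate(bids) if b >= reserve),
        key=lambda pair: pair[0],
        reverse=True,
    )
    allocation: List[Tuple[int, float]] = []
    revenue = 0.0
    for slot, ctr in enumerate(slot_ctrs):
        if slot >= len(eligible):
            break  # more slots than eligible bidders
        _, bidder = eligible[slot]
        # Each winner pays the next-highest eligible bid, floored at the reserve.
        if slot + 1 < len(eligible):
            price = max(eligible[slot + 1][0], reserve)
        else:
            price = reserve
        allocation.append((bidder, price))
        revenue += ctr * price
    return allocation, revenue


def best_static_reserve(
    bids: Sequence[float], slot_ctrs: Sequence[float], candidates: Sequence[float]
) -> float:
    """Pick the candidate reserve with the highest revenue for fixed bids."""
    return max(candidates, key=lambda r: gsp_with_reserve(bids, slot_ctrs, r)[1])


if __name__ == "__main__":
    bids = [4.0, 2.0, 1.0]   # illustrative per-click bids
    ctrs = [0.5, 0.25]       # illustrative click rates for two slots
    print(gsp_with_reserve(bids, ctrs, reserve=1.5))
    print(best_static_reserve(bids, ctrs, candidates=[0.0, 1.5, 2.0, 5.0]))
```

A static search like this serves only as a baseline of the kind the abstract compares against. A dynamic approach would replace the fixed `bids` with the output of a learned bidder model and choose the reserve at each round based on the state reached so far.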
In this work we investigate the strategic learning implications of deploying sponsored search auction mechanisms that obey fairness criteria. We introduce a new class of mechanisms composing a traditional Generalized Second Price auction (
Sponsored Search Auctions (SSAs) arguably represent the problem at the intersection of computer science and economics with the deepest applications in real life. Within the realm of SSAs, the study of the effects that showing one ad has on the other
In quasi-proportional auctions, each bidder receives a fraction of the allocation equal to the weight of their bid divided by the sum of weights of all bids, where each bid's weight is determined by a weight function. We study the relationship between
We study Bayesian automated mechanism design in unstructured dynamic environments, where a principal repeatedly interacts with an agent and takes actions based on the strategic agent's report of the current state of the world. Both the principal and
We study the problem of repeatedly auctioning off an item to one of $k$ bidders where: a) bidders have a per-round individual rationality constraint, b) bidders may leave the mechanism at any point, and c) the bidders' valuations are adversarially cho