No Arabic abstract
Bayesian persuasion is the study of information sharing policies among strategic agents. A prime example is signaling in online ad auctions: what information should a platform signal to an advertiser regarding a user when selling the opportunity to advertise to her? Practical considerations such as preventing discrimination, protecting privacy or acknowledging limited attention of the information receiver impose constraints on information sharing. In this work, we propose and analyze a simple way to mathematically model such constraints as restrictions on Receivers admissible posterior beliefs. We consider two families of constraints - ex ante and ex post, where the latter limits each instance of Sender-Receiver communication, while the former more general family can also pose restrictions in expectation. For the ex ante family, Doval and Skreta establish the existence of an optimal signaling scheme with a small number of signals - at most the number of constraints plus the number of states of nature; we show this result is tight and provide an alternative proof for it. For the ex post family, we tighten a bound of V{o}lund, showing that the required number of signals is at most the number of states of nature, as in the original Kamenica-Gentzkow setting. As our main algorithmic result, we provide an additive bi-criteria FPTAS for an optimal constrained signaling scheme assuming a constant number of states; we improve the approximation to single-criteria under a Slater-like regularity condition. The FPTAS holds under standard assumptions; relaxed assumptions yield a PTAS. Finally, we bound the ratio between Senders optimal utility under convex ex ante constraints and the corresponding ex post constraints. This bound applies to finding an approximately welfare-maximizing constrained signaling scheme in ad auctions.
We provide a characterization of revenue-optimal dynamic mechanisms in settings where a monopolist sells k items over k periods to a buyer who realizes his value for item i in the beginning of period i. We require that the mechanism satisfies a strong individual rationality constraint, requiring that the stage utility of each agent be positive during each period. We show that the optimum mechanism can be computed by solving a nested sequence of static (single-period) mechanisms that optimize a tradeoff between the surplus of the allocation and the buyers utility. We also provide a simple dynamic mechanism that obtains at least half of the optimal revenue. The mechanism either ignores history and posts the optimal monopoly price in each period, or allocates with a probability that is independent of the current report of the agent and is based only on previous reports. Our characterization extends to multi-agent auctions. We also formulate a discounted infinite horizon version of the problem, where we study the performance of Markov mechanisms.
We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game. Team members are not allowed to communicate during play but can coordinate before the game. In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensive-form correlation. Second, we provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile. We can also cap the number of such profiles we allow in the solution. This begets an anytime algorithm by increasing the cap. We find that often a handful of well-chosen such profiles suffices to reach optimal utility for the team. This enables team members to reach coordination through a relatively simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, we develop an efficient column-generation algorithm for finding an optimal distribution for the team. We evaluate it on a suite of common benchmark games. It is three orders of magnitude faster than the prior state of the art on games that the latter can solve and it can also solve several games that were previously unsolvable.
We study a Bayesian persuasion setting with binary actions (adopt and reject) for Receiver. We examine the following question - how well can Sender perform, in terms of persuading Receiver to adopt, when ignorant of Receivers utility? We take a robust (adversarial) approach to study this problem; that is, our goal is to design signaling schemes for Sender that perform well for all possible Receivers utilities. We measure performance of signaling schemes via the notion of (additive) regret: the difference between Senders hypothetically optimal utility had she known Receivers utility function and her actual utility induced by the given scheme. On the negative side, we show that if Sender has no knowledge at all about Receivers utility, then Sender has no signaling scheme that performs robustly well. On the positive side, we show that if Sender only knows Receivers ordinal preferences of the states of nature - i.e., Receivers utility upon adoption is monotonic as a function of the state - then Sender can guarantee a surprisingly low regret even when the number of states tends to infinity. In fact, we exactly pin down the minimum regret value that Sender can guarantee in this case, which turns out to be at most 1/e. We further show that such positive results are not possible under the alternative performance measure of a multiplicative approximation ratio by proving that no constant ratio can be guaranteed even for monotonic Receivers utility; this may serve to demonstrate the merits of regret as a robust performance measure that is not too pessimistic. Finally, we analyze an intermediate setting in between the no-knowledge and the ordinal-knowledge settings.
Bayesian persuasion studies how an informed sender should partially disclose information to influence the behavior of a self-interested receiver. Classical models make the stringent assumption that the sender knows the receivers utility. This can be relaxed by considering an online learning framework in which the sender repeatedly faces a receiver of an unknown, adversarially selected type. We study, for the first time, an online Bayesian persuasion setting with multiple receivers. We focus on the case with no externalities and binary actions, as customary in offline models. Our goal is to design no-regret algorithms for the sender with polynomial per-iteration running time. First, we prove a negative result: for any $0 < alpha leq 1$, there is no polynomial-time no-$alpha$-regret algorithm when the senders utility function is supermodular or anonymous. Then, we focus on the case of submodular senders utility functions and we show that, in this case, it is possible to design a polynomial-time no-$(1 - frac{1}{e})$-regret algorithm. To do so, we introduce a general online gradient descent scheme to handle online learning problems with a finite number of possible loss functions. This requires the existence of an approximate projection oracle. We show that, in our setting, there exists one such projection oracle which can be implemented in polynomial time.
Persuasion studies how an informed principal may influence the behavior of agents by the strategic provision of payoff-relevant information. We focus on the fundamental multi-receiver model by Arieli and Babichenko (2019), in which there are no inter-agent externalities. Unlike prior works on this problem, we study the public persuasion problem in the general setting with: (i) arbitrary state spaces; (ii) arbitrary action spaces; (iii) arbitrary senders utility functions. We fully characterize the computational complexity of computing a bi-criteria approximation of an optimal public signaling scheme. In particular, we show, in a voting setting of independent interest, that solving this problem requires at least a quasi-polynomial number of steps even in settings with a binary action space, assuming the Exponential Time Hypothesis. In doing so, we prove that a relaxed version of the Maximum Feasible Subsystem of Linear Inequalities problem requires at least quasi-polynomial time to be solved. Finally, we close the gap by providing a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, in specific settings, yields a QPTAS.