Multi-Receiver Online Bayesian Persuasion


Abstract

Bayesian persuasion studies how an informed sender should partially disclose information to influence the behavior of a self-interested receiver. Classical models make the stringent assumption that the sender knows the receiver's utility. This can be relaxed by considering an online learning framework in which the sender repeatedly faces a receiver of an unknown, adversarially selected type. We study, for the first time, an online Bayesian persuasion setting with multiple receivers. We focus on the case with no externalities and binary actions, as is customary in offline models. Our goal is to design no-regret algorithms for the sender with polynomial per-iteration running time. First, we prove a negative result: for any $0 < \alpha \leq 1$, there is no polynomial-time no-$\alpha$-regret algorithm when the sender's utility function is supermodular or anonymous. Then, we focus on the case of submodular sender's utility functions and show that, in this case, it is possible to design a polynomial-time no-$(1 - \frac{1}{e})$-regret algorithm. To do so, we introduce a general online gradient descent scheme to handle online learning problems with a finite number of possible loss functions. This requires the existence of an approximate projection oracle. We show that, in our setting, one such projection oracle exists and can be implemented in polynomial time.
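To make the role of the projection step concrete, the following is a minimal sketch of textbook projected online gradient descent, not the scheme introduced in the paper. It assumes linear losses and a decision space equal to the probability simplex, and it uses an exact Euclidean projection, whereas the abstract describes an approximate projection oracle over the sender's own decision space. The function names `project_to_simplex` and `online_gradient_descent`, the step size, and the loss sequence are all hypothetical choices made for illustration.

```python
def project_to_simplex(v):
    """Exact Euclidean projection of v onto the probability simplex.

    Assumption: this stands in for the approximate projection oracle
    mentioned in the abstract, which is not specified here.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold of the last index that passes
    return [max(x - theta, 0.0) for x in v]


def online_gradient_descent(loss_vectors, dim, step_size):
    """Play one point per round against linear losses <loss, x>.

    Returns the list of points played, one per round.
    """
    x = [1.0 / dim] * dim  # start from the uniform distribution
    plays = []
    for loss in loss_vectors:
        plays.append(x)
        # The gradient of the linear loss <loss, x> is the loss vector itself.
        stepped = [xi - step_size * gi for xi, gi in zip(x, loss)]
        x = project_to_simplex(stepped)
    return plays


if __name__ == "__main__":
    # Hypothetical loss sequence: coordinate 0 is always costly.
    losses = [[1.0, 0.0]] * 50
    plays = online_gradient_descent(losses, dim=2, step_size=0.1)
    print(plays[-1])
```

With this loss sequence the iterates shift weight away from the costly coordinate each round until they reach the vertex that puts all mass on the second coordinate. Replacing the exact projection with an approximate one changes the regret guarantee, which is the kind of trade-off the approximate oracle in the abstract addresses.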
