Learning to Bid in Contextual First Price Auctions


Abstract

In this paper, we investigate the problem of how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in first price auctions: at each time $t$, the learner observes a context $x_t \in \mathbb{R}^d$ and decides the bid based on historical information and $x_t$. We assume a structured linear model for the maximum bid of all other bidders, $m_t = \alpha_0 \cdot x_t + z_t$, where $\alpha_0 \in \mathbb{R}^d$ is unknown to the learner and $z_t$ is randomly sampled from a noise distribution $\mathcal{F}$ with log-concave density function $f$. We consider both \emph{binary feedback} (the learner observes only whether she wins or not) and \emph{full information feedback} (the learner observes $m_t$) at the end of each time $t$. For binary feedback, when the noise distribution $\mathcal{F}$ is known, we propose a bidding algorithm that uses maximum likelihood estimation (MLE) to achieve regret at most $\widetilde{O}(\sqrt{\log(d) T})$. Moreover, we generalize this algorithm to the binary feedback setting in which the noise distribution is unknown but belongs to a parametrized family of distributions. For full information feedback with \emph{unknown} noise distribution, we provide an algorithm that achieves regret at most $\widetilde{O}(\sqrt{dT})$. Our approach combines an estimator for log-concave density functions with the MLE method to learn the noise distribution $\mathcal{F}$ and the linear weight $\alpha_0$ simultaneously. We also provide a lower bound showing that any bidding policy in a broad class must incur regret at least $\Omega(\sqrt{T})$, even when the learner receives full information feedback and $\mathcal{F}$ is known.
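To make the binary feedback setting concrete, the following is a minimal sketch of estimating $\alpha_0$ by maximum likelihood from win/lose outcomes alone. It is not the algorithm of the paper: it assumes, purely for illustration, that $\mathcal{F}$ is the standard logistic distribution (whose density is log-concave), that bids are drawn uniformly at random by a hypothetical exploration policy, and that the likelihood is maximized by plain gradient ascent. The function names `simulate` and `fit_alpha_mle` are likewise illustrative. Under these assumptions the learner wins with probability $F(b_t - \alpha_0 \cdot x_t)$, so the log-likelihood is that of a logistic regression with offset $b_t$.

```python
import numpy as np


def sigmoid(u):
    """CDF of the standard logistic distribution (assumed noise model)."""
    return 1.0 / (1.0 + np.exp(-u))


def simulate(alpha0, T, rng):
    """Generate T rounds of contexts, bids, and binary win/lose feedback."""
    d = alpha0.shape[0]
    x = rng.normal(size=(T, d))            # contexts x_t
    z = rng.logistic(size=T)               # noise z_t (assumption: logistic)
    m = x @ alpha0 + z                     # highest competing bid m_t
    b = rng.uniform(-3.0, 3.0, size=T)     # hypothetical exploratory bids
    w = (b >= m).astype(float)             # 1 if the learner wins, else 0
    return x, b, w


def fit_alpha_mle(x, b, w, steps=2000, lr=0.5):
    """Gradient ascent on the average log-likelihood of the win indicators."""
    T, d = x.shape
    alpha = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(b - x @ alpha)         # model win probability F(b_t - alpha . x_t)
        grad = -(x.T @ (w - p)) / T        # gradient with respect to alpha
        alpha += lr * grad                 # ascent step (objective is concave here)
    return alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha0 = np.array([0.5, -0.3, 0.8])    # illustrative "true" weights
    x, b, w = simulate(alpha0, 20000, rng)
    print("true alpha0:", alpha0)
    print("MLE estimate:", fit_alpha_mle(x, b, w))
```

The estimate uses only the bids and the win indicators, never $m_t$ itself, which is what distinguishes binary feedback from full information feedback.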
