ﻻ يوجد ملخص باللغة العربية
Modern online advertising systems inevitably rely on personalization methods, such as click-through rate (CTR) prediction. Recent progress in CTR prediction enjoys the rich representation capabilities of deep learning and achieves great success in large-scale industrial applications. However, these methods can suffer from lack of exploration. Another line of prior work addresses the exploration-exploitation trade-off problem with contextual bandit methods, which are recently less studied in the industry due to the difficulty in extending their flexibility with deep models. In this paper, we propose a novel Deep Uncertainty-Aware Learning (DUAL) method to learn CTR models based on Gaussian processes, which can provide predictive uncertainty estimations while maintaining the flexibility of deep neural networks. DUAL can be easily implemented on existing models and deployed in real-time systems with minimal extra computational overhead. By linking the predictive uncertainty estimation ability of DUAL to well-known bandit algorithms, we further present DUAL-based Ad-ranking strategies to boost up long-term utilities such as the social welfare in advertising systems. Experimental results on several public datasets demonstrate the effectiveness of our methods. Remarkably, an online A/B test deployed in the Alibaba display advertising platform shows an 8.2% social welfare improvement and an 8.0% revenue lift.
With the recent prevalence of Reinforcement Learning (RL), there have been tremendous interests in utilizing RL for online advertising in recommendation platforms (e.g., e-commerce and news feed sites). However, most RL-based advertising algorithms f
Search, recommendation, and online advertising are the three most important information-providing mechanisms on the web. These information seeking techniques, satisfying users information needs by suggesting users personalized objects (information or
A significant remaining challenge for existing recommender systems is that users may not trust the recommender systems for either lack of explanation or inaccurate recommendation results. Thus, it becomes critical to embrace a trustworthy recommender
Advertising expenditures have become the major source of revenue for e-commerce platforms. Providing good advertising experiences for advertisers by reducing their costs of trial and error in discovering the optimal advertising strategies is crucial
Online learning to rank (OL2R) optimizes the utility of returned search results based on implicit feedback gathered directly from users. To improve the estimates, OL2R algorithms examine one or more exploratory gradient directions and update the curr