ترغب بنشر مسار تعليمي؟ اضغط هنا

Adaptive Keywords Extraction with Contextual Bandits for Advertising on Parked Domains

175   0   0.0 ( 0 )
 نشر من قبل Shuai Yuan
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Domain name registrars and URL shortener service providers place advertisements on the parked domains (Internet domain names which are not in service) in order to generate profits. As the web contents have been removed, it is critical to make sure the displayed ads are directly related to the intents of the visitors who have been directed to the parked domains. Because of the missing contents in these domains, it is non-trivial to generate the keywords to describe the previous contents and therefore the users intents. In this paper we discuss the adaptive keywords extraction problem and introduce an algorithm based on the BM25F term weighting and linear multi-armed bandits. We built a prototype over a production domain registration system and evaluated it using crowdsourcing in multiple iterations. The prototype is compared with other popular methods and is shown to be more effective.

قيم البحث

اقرأ أيضاً

In this paper, we consider online learning in generalized linear contextual bandits where rewards are not immediately observed. Instead, rewards are available to the decision-maker only after some delay, which is unknown and stochastic. We study the performance of two well-known algorithms adapted to this delayed setting: one based on upper confidence bounds, and the other based on Thompson sampling. We describe modifications on how these two algorithms should be adapted to handle delays and give regret characterizations for both algorithms. Our results contribute to the broad landscape of contextual bandits literature by establishing that both algorithms can be made to be robust to delays, thereby helping clarify and reaffirm the empirical success of these two algorithms, which are widely deployed in modern recommendation engines.
51 - Sourav Dutta , Haytham Assem , 2021
Frequently-Asked-Question (FAQ) retrieval provides an effective procedure for responding to users natural language based queries. Such platforms are becoming common in enterprise chatbots, product question answering, and preliminary technical support for customers. However, the challenge in such scenarios lies in bridging the lexical and semantic gap between varied query formulations and the corresponding answers, both of which typically have a very short span. This paper proposes TI-S2S, a novel learning framework combining TF-IDF based keyword extraction and Word2Vec embeddings for training a Sequence-to-Sequence (Seq2Seq) architecture. It achieves high precision for FAQ retrieval by better understanding the underlying intent of a user question captured via the representative keywords. We further propose a variant with an additional neural network module for guiding retrieval via relevant candidate identification based on similarity features. Experiments on publicly available dataset depict our approaches to provide around 92% precision-at-rank-5, exhibiting nearly 13% improvement over existing approaches.
We consider the linear contextual bandit problem with resource consumption, in addition to reward generation. In each round, the outcome of pulling an arm is a reward as well as a vector of resource consumptions. The expected values of these outcomes depend linearly on the context of that arm. The budget/capacity constraints require that the total consumption doesnt exceed the budget for each resource. The objective is once again to maximize the total reward. This problem turns out to be a common generalization of classic linear contextual bandits (linContextual), bandits with knapsacks (BwK), and the online stochastic packing problem (OSPP). We present algorithms with near-optimal regret bounds for this problem. Our bounds compare favorably to results on the unstructured version of the problem where the relation between the contexts and the outcomes could be arbitrary, but the algorithm only competes against a fixed set of policies accessible through an optimization oracle. We combine techniques from the work on linContextual, BwK, and OSPP in a nontrivial manner while also tackling new difficulties that are not present in any of these special cases.
We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resou rces other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS13)), and prove a regret guarantee with near-optimal statistical properties.
We consider the problem of learning to choose actions using contextual information when provided with limited feedback in the form of relative pairwise comparisons. We study this problem in the dueling-bandits framework of Yue et al. (2009), which we extend to incorporate context. Roughly, the learners goal is to find the best policy, or way of behaving, in some space of policies, although best is not always so clearly defined. Here, we propose a new and natural solution concept, rooted in game theory, called a von Neumann winner, a randomized policy that beats or ties every other policy. We show that this notion overcomes important limitations of existing solutions, particularly the Condorcet winner which has typically been used in the past, but which requires strong and often unrealistic assumptions. We then present three efficient algorithms for online learning in our setting, and for approximating a von Neumann winner from batch-like data. The first of these algorithms achieves particularly low regret, even when data is adversarial, although its time and space requirements are linear in the size of the policy space. The other two algorithms require time and space only logarithmic in the size of the policy space when provided access to an oracle for solving classification problems on the space.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا