ترغب بنشر مسار تعليمي؟ اضغط هنا

Blending Search and Discovery: Tag-Based Query Refinement with Contextual Reinforcement Learning

291   0   0.0 ( 0 )
 نشر من قبل Bingqing Yu
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We tackle tag-based query refinement as a mobile-friendly alternative to standard facet search. We approach the inference challenge with reinforcement learning, and propose a deep contextual bandit that can be efficiently scaled in a multi-tenant SaaS scenario.

قيم البحث

اقرأ أيضاً

It is a long-standing question to discover causal relations among a set of variables in many empirical sciences. Recently, Reinforcement Learning (RL) has achieved promising results in causal discovery from observational data. However, searching the space of directed graphs and enforcing acyclicity by implicit penalties tend to be inefficient and restrict the existing RL-based method to small scale problems. In this work, we propose a novel RL-based approach for causal discovery, by incorporating RL into the ordering-based paradigm. Specifically, we formulate the ordering search problem as a multi-step Markov decision process, implement the ordering generating process with an encoder-decoder architecture, and finally use RL to optimize the proposed model based on the reward mechanisms designed for~each ordering. A generated ordering would then be processed using variable selection to obtain the final causal graph. We analyze the consistency and computational complexity of the proposed method, and empirically show that a pretrained model can be exploited to accelerate training. Experimental results on both synthetic and real data sets shows that the proposed method achieves a much improved performance over existing RL-based method.
Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors acros s related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally.
Natural Language Search (NLS) extends the capabilities of search engines that perform keyword search allowing users to issue queries in a more natural language. The engine tries to understand the meaning of the queries and to map the query words to t he symbols it supports like Persons, Organizations, Time Expressions etc.. It, then, retrieves the information that satisfies the users need in different forms like an answer, a record or a list of records. We present an NLS system we implemented as part of the Search service of a major CRM platform. The system is currently in production serving thousands of customers. Our user studies showed that creating dynamic reports with NLS saved more than 50% of our users time compared to achieving the same result with navigational search. We describe the architecture of the system, the particularities of the CRM domain as well as how they have influenced our design decisions. Among several submodules of the system we detail the role of a Deep Learning Named Entity Recognizer. The paper concludes with discussion over the lessons learned while developing this product.
Code search is a common practice for developers during software implementation. The challenges of accurate code search mainly lie in the knowledge gap between source code and natural language (i.e., queries). Due to the limited code-query pairs and l arge code-description pairs available, the prior studies based on deep learning techniques focus on learning the semantic matching relation between source code and corresponding description texts for the task, and hypothesize that the semantic gap between descriptions and user queries is marginal. In this work, we found that the code search models trained on code-description pairs may not perform well on user queries, which indicates the semantic distance between queries and code descriptions. To mitigate the semantic distance for more effective code search, we propose QueCos, a Query-enriched Code search model. QueCos learns to generate semantic enriched queries to capture the key semantics of given queries with reinforcement learning (RL). With RL, the code search performance is considered as a reward for producing accurate semantic enriched queries. The enriched queries are finally employed for code search. Experiments on the benchmark datasets show that QueCos can significantly outperform the state-of-the-art code search models.
Large-scale finite element simulations of complex physical systems governed by partial differential equations crucially depend on adaptive mesh refinement (AMR) to allocate computational budget to regions where higher resolution is required. Existing scalable AMR methods make heuristic refinement decisions based on instantaneous error estimation and thus do not aim for long-term optimality over an entire simulation. We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning (RL) to train refinement policies directly from simulation. AMR poses a new problem for RL in that both the state dimension and available action set changes at every step, which we solve by proposing new policy architectures with differing generality and inductive bias. The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations. We demonstrate in comprehensive experiments on static function estimation and the advection of different fields that RL policies can be competitive with a widely-used error estimator and generalize to larger, more complex, and unseen test problems.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا