
Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval

Published by Zhiwen Tang
Publication date: 2020
Research field: Informatics Engineering
Research language: English





Interactive Information Retrieval (IIR) and Reinforcement Learning (RL) share many commonalities, including an agent that learns while interacting, a long-term and complex goal, and an algorithm that explores and adapts. To successfully apply RL methods to IIR, one challenge is obtaining sufficient relevance labels to train the RL agents, which are notoriously sample-inefficient. However, in a text corpus annotated for a given query, it is not the relevant documents but the irrelevant documents that predominate. This causes very unbalanced training experiences for the agent and prevents it from learning an effective policy. Our paper addresses this issue by using domain randomization to synthesize more relevant documents for training. Our experimental results on the Text REtrieval Conference (TREC) Dynamic Domain (DD) 2017 Track show that the proposed method boosts an RL agent's learning effectiveness by 22% in dealing with unseen situations.




Read also

In Interactive Information Retrieval (IIR) experiments, the user's gaze motion on web pages is often recorded with eye tracking. The data is used to analyze gaze behavior or to identify the Areas of Interest (AOIs) the user has looked at. So far, tools for analyzing eye tracking data have had certain limitations in supporting the analysis of gaze behavior in IIR experiments. Experiments often involve a huge number of different visited web pages. In existing analysis tools, the data can only be analyzed in videos or images, and AOIs for every single web page have to be specified by hand in a very time-consuming process. In this work, we propose the reading protocol software, which breaks eye tracking data down to the textual level by considering the HTML structure of the web pages. This has several advantages for the analyst. First and foremost, what the subjects actually viewed and read on the stimulus pages can easily be identified on a large scale. Second, the web page structure can be used to filter to AOIs. Third, gaze data of multiple users can be presented on the same page. Fourth, fixation times on text can be exported and further processed in other tools. We present the software, its validation, and example use cases with data from three existing IIR experiments.
Interactive recommendation aims to learn from dynamic interactions between items and users to achieve responsiveness and accuracy. Reinforcement learning is inherently advantageous for coping with dynamic environments and thus has attracted increasing attention in interactive recommendation research. Inspired by knowledge-aware recommendation, we propose Knowledge-Guided deep Reinforcement learning (KGRL) to harness the advantages of both reinforcement learning and knowledge graphs for interactive recommendation. The model is built on the actor-critic network framework. It maintains a local knowledge network to guide decision-making and employs an attention mechanism to capture long-term semantics between items. We have conducted comprehensive experiments in a simulated online environment with six public real-world datasets and demonstrated the superiority of our model over several state-of-the-art methods.
Zeeshan Ahmed, 2011
PDM systems contain and manage large amounts of data, but the search mechanism of most of these systems is not intelligent enough to process users' natural-language queries and extract the desired information. The search mechanisms currently available in almost all PDM systems are not very efficient and rely on the old approach of entering relevant information into the respective fields of search forms to find specific information in attached repositories. To address this issue, thorough research was conducted in the fields of PDM systems and language technology. Concerning PDM systems, the research provides detailed information about PDM and PDM systems. Concerning language technology, it helps in implementing a search mechanism for PDM systems that finds the information users need by analyzing their natural-language requests. The goal of this research was to support the field of PDM with a new proposal: a conceptual model for the implementation of natural-language-based search. The proposed conceptual model has been successfully designed and partially implemented in the form of a prototype. This paper describes the proposal in detail, discussing the main concept, the implementation designs, and the developed prototype. The implemented prototype is compared with the respective functions of existing PDM systems, i.e., Windchill and CIM, to evaluate its effectiveness against the targeted challenges.
Because it learns from dynamic interactions and plans for long-run performance, reinforcement learning (RL) has recently received much attention in interactive recommender systems (IRSs). IRSs usually face the large discrete action space problem, which makes most existing RL-based recommendation methods inefficient. Moreover, data sparsity is another challenging problem that most IRSs are confronted with. While textual information such as reviews and descriptions is less sensitive to sparsity, existing RL-based recommendation methods either neglect textual information or are not suitable for incorporating it. To address these two problems, in this paper we propose a Text-based Deep Deterministic Policy Gradient framework (TDDPG-Rec) for IRSs. Specifically, we leverage textual information to map items and users into a feature space, which greatly alleviates the sparsity problem. Moreover, we design an effective method to construct an action candidate set. Using the policy vector dynamically learned by TDDPG-Rec, which expresses the user's preference, we can select actions from the candidate set effectively. Through experiments on three public datasets, we demonstrate that TDDPG-Rec achieves state-of-the-art performance over several baselines in a time-efficient manner.
Rong Gong, Xavier Serra, 2017
Music Information Retrieval (MIR) technologies have proven useful in assisting western classical singing training. Jingju (also known as Beijing or Peking opera) singing differs from western singing in most perceptual dimensions, and the trainees are taught using the mouth/heart method. In this paper, we first present the training method used in the professional jingju training classroom scenario and show the potential benefits of introducing MIR technologies into the training process. The main part of this paper is dedicated to identifying the potential MIR technologies for jingju singing training. To this end, we answer the question: how do jingju singing tutors and trainees value the importance of each jingju musical dimension: intonation, rhythm, loudness, tone quality and pronunciation? This is done by (i) classifying the classroom singing practices and the tutors' verbal feedback into these 5 dimensions, and (ii) surveying the trainees. Then, with the help of music signal analysis, a finer inspection of the classroom practice recording examples reveals the detailed elements of the training process. Finally, based on the above analysis, several potential MIR technologies that would be useful for jingju singing training are identified.
