Term-Mouse-Fixations as an Additional Indicator for Topical User Interests in Domain-Specific Search


Added by Daniel Hienert

Publication date 2018

Fields Informatics Engineering

Language English

Authors Daniel Hienert - Dagmar Kern

Information Retrieval


Models in Interactive Information Retrieval (IIR) are strongly grounded in the user's task, so that the system can provide support based on different task types and topics. However, automatically recognizing user interests from log data in search systems is not trivial. Search queries entered by users are surely one such source. However, queries may be short, or users may only be browsing. In this paper, we propose a method of term-mouse-fixations, which takes the fixations on terms users hover over with the mouse into consideration to estimate topical user interests. We analyzed 22,259 search sessions of a domain-specific digital library over a period of about four months. We compared these mouse fixations to user-entered search terms and to titles and keywords from documents the user showed an interest in. These terms were found in 87.12% of all analyzed sessions; in this subset of sessions, on average 11.46 term-mouse-fixations from queries and viewed documents were found per session. These terms were fixated significantly longer (about 7 seconds) than other terms (about 4.4 seconds). This means that term-mouse-fixations provide indicators for topical user interests and that it is possible to extract them based on fixation time.
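The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the log format (pairs of hovered term and hover duration in seconds), the function names, and the threshold `min_seconds` are all assumptions made for illustration, and the toy session data is invented.

```python
from collections import defaultdict
from statistics import mean


def split_fixations(fixations, interest_terms):
    """Split (term, seconds) hover records into two lists of durations:
    those on terms that also occur in queries/viewed documents, and the rest."""
    interest = {t.lower() for t in interest_terms}
    matched, other = [], []
    for term, seconds in fixations:
        (matched if term.lower() in interest else other).append(seconds)
    return matched, other


def candidate_interest_terms(fixations, min_seconds):
    """Return terms whose summed hover time reaches a threshold.
    The threshold is a hypothetical parameter, not a value from the paper."""
    totals = defaultdict(float)
    for term, seconds in fixations:
        totals[term.lower()] += seconds
    return sorted(t for t, s in totals.items() if s >= min_seconds)


# Invented toy session: (hovered term, hover duration in seconds).
session = [
    ("migration", 7.5),
    ("survey", 2.0),
    ("Migration", 1.5),
    ("panel", 6.0),
    ("data", 3.0),
]
query_and_doc_terms = ["migration", "panel"]

matched, other = split_fixations(session, query_and_doc_terms)
print(mean(matched), mean(other))
print(candidate_interest_terms(session, min_seconds=5.0))
```

In a real log, the two duration lists would feed a significance test rather than a simple comparison of means.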


User-specific Adaptive Fine-tuning for Cross-domain Recommendations

Lei Chen, Fajie Yuan, Jiaxi Yang 2021

Making accurate recommendations for cold-start users has been a longstanding and critical challenge for recommender systems (RS). Cross-domain recommendations (CDR) offer a solution to the cold-start problem when there is insufficient data for users who have rarely used the system. An effective approach in CDR is to leverage the knowledge (e.g., user representations) learned from a related but different domain and transfer it to the target domain. Fine-tuning works as an effective transfer learning technique for this objective: it adapts the parameters of a pre-trained model from the source domain to the target domain. However, current methods are mainly based on a global fine-tuning strategy: the decision of which layers of the pre-trained model to freeze or fine-tune is taken once for all users in the target domain. In this paper, we argue that users in RS are personalized and should have their own fine-tuning policies for better preference transfer learning. As such, we propose a novel User-specific Adaptive Fine-tuning method (UAF), which selects which layers of the pre-trained network to fine-tune on a per-user basis. Specifically, we devise a policy network with three alternative strategies to automatically decide, for each user, which layers to fine-tune and which layers to freeze. Extensive experiments show that the proposed UAF exhibits significantly better and more robust performance for user cold-start recommendation.

Information Retrieval

Aspect-based Academic Search using Domain-specific KB

Prajna Upadhyay, Srikanta Bedathur, Tanmoy Chakraborty 2020

Academic search engines allow scientists to explore related work relevant to a given query. Often, the user also knows which aspect a relevant document should address. In such cases, existing search engines can be used by expanding the query with terms describing that aspect. However, this approach does not guarantee good results, since plain keyword matches do not always imply relevance. To address this issue, we define and solve a novel academic search task, called aspect-based retrieval, which allows the user to specify the aspect along with the query to retrieve a ranked list of relevant documents. The primary idea is to estimate a language model for the aspect as well as for the query using a domain-specific knowledge base, and to use a mixture of the two to determine the relevance of an article. Our evaluation on the Open Research Corpus dataset shows that our method outperforms keyword-based expansion of the query with the aspect, both with and without relevance feedback.
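The idea of mixing a query language model with an aspect language model can be sketched as follows. This is a generic illustration, not the method from the paper: the models here are plain unigram distributions built from toy term lists rather than estimated from a knowledge base, the mixing weight `lam` and smoothing constant `alpha` are hypothetical parameters, and the scoring function is a standard smoothed negative cross-entropy chosen for simplicity.

```python
import math
from collections import Counter


def unigram_lm(terms):
    """Maximum-likelihood unigram model over a list of terms."""
    counts = Counter(terms)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def mix(query_lm, aspect_lm, lam):
    """Linear mixture: lam * P_query(w) + (1 - lam) * P_aspect(w)."""
    vocab = set(query_lm) | set(aspect_lm)
    return {
        t: lam * query_lm.get(t, 0.0) + (1 - lam) * aspect_lm.get(t, 0.0)
        for t in vocab
    }


def score(doc_terms, mixed_lm, vocab_size, alpha=1.0):
    """Negative cross-entropy between the mixed model and an
    additively smoothed document model (higher is better)."""
    counts = Counter(doc_terms)
    denom = len(doc_terms) + alpha * vocab_size
    return sum(
        p * math.log((counts[t] + alpha) / denom) for t, p in mixed_lm.items()
    )


# Invented toy data: a query, an aspect, and two candidate documents.
query_lm = unigram_lm(["neural", "networks"])
aspect_lm = unigram_lm(["application", "deployment"])
mixed = mix(query_lm, aspect_lm, lam=0.5)

doc_on_aspect = ["neural", "networks", "application", "deployment", "case"]
doc_off_aspect = ["neural", "networks", "theory", "proof", "bounds"]

s_on = score(doc_on_aspect, mixed, vocab_size=10)
s_off = score(doc_off_aspect, mixed, vocab_size=10)
print(s_on > s_off)
```

Both toy documents match the query terms equally, so the difference in score comes only from the aspect component of the mixture.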

Information Retrieval

Topical Result Caching in Web Search Engines

Ida Mele, Nicola Tonellotto, Ophir Frieder 2020

Caching search results is employed in information retrieval systems to expedite query processing and reduce back-end server workload. Motivated by the observation that queries belonging to different topics have different temporal-locality patterns, we investigate a novel caching model called STD (Static-Topic-Dynamic cache). It improves on the traditional SDC (Static-Dynamic Cache), which stores the results of popular queries in a static cache and manages a dynamic cache with a replacement policy to capture temporal variations in the query stream. Our proposed caching scheme includes another layer for topic-based caching, where the entries are allocated to different topics (e.g., weather, education). The results of queries characterized by a topic are kept in the fraction of the cache dedicated to that topic. This allows the cache-space utilization to adapt to the temporal locality of the various topics and reduces cache misses due to queries that are neither sufficiently popular to be in the static portion nor requested within short enough time intervals to be in the dynamic portion. We simulate different configurations for STD using two real-world query streams. Experiments demonstrate that our approach outperforms SDC, with an increase of up to 3% in hit rate and a reduction of up to 36% in the gap between SDC and the theoretical optimal caching algorithm.
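The three-layer layout described above can be sketched in a few lines of Python. This is a simplified illustration under several assumptions, not the authors' system: LRU is assumed as the replacement policy for the topic and dynamic sections, the topic of each query is assumed to be supplied by the caller (no classifier is shown), and the class names `LRU` and `STDCache` and all capacities are hypothetical.

```python
from collections import OrderedDict


class LRU:
    """Small least-recently-used cache (assumed replacement policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


class STDCache:
    """Static section, one LRU section per topic, and a dynamic LRU
    section for queries without a known topic."""

    def __init__(self, static_results, topic_capacities, dynamic_capacity):
        self.static = dict(static_results)  # fixed set of popular queries
        self.topics = {t: LRU(c) for t, c in topic_capacities.items()}
        self.dynamic = LRU(dynamic_capacity)

    def _section(self, topic):
        # Unknown or missing topics fall back to the dynamic section.
        return self.topics.get(topic, self.dynamic)

    def lookup(self, query, topic=None):
        if query in self.static:
            return self.static[query]
        return self._section(topic).get(query)

    def store(self, query, results, topic=None):
        if query not in self.static:
            self._section(topic).put(query, results)


# Invented toy usage with tiny capacities to make eviction visible.
cache = STDCache({"news": ["r0"]}, {"weather": 1}, dynamic_capacity=1)
cache.store("rain tomorrow", ["r1"], topic="weather")
cache.store("cheap flights", ["r2"])
cache.store("hotel deals", ["r3"])  # evicts "cheap flights" from the dynamic section
print(cache.lookup("rain tomorrow", topic="weather"))
```

Because the weather query lives in its own section, traffic in the dynamic section does not evict it, which is the effect the topic layer is meant to achieve.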

Information Retrieval Databases

Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature

Yu Wang, Jinchao Li, Tristan Naumann 2021

Information overload is a prevalent challenge in many high-value domains. A prominent case in point is the explosion of the biomedical literature on COVID-19, which swelled to hundreds of thousands of papers in a matter of months. In general, the biomedical literature expands by two papers every minute, totalling over a million new papers every year. Search in the biomedical realm and many other vertical domains is challenging due to the scarcity of direct supervision from click logs. Self-supervised learning has emerged as a promising direction to overcome the annotation bottleneck. We propose a general approach for vertical search based on domain-specific pretraining and present a case study for the biomedical domain. Despite being substantially simpler and not using any relevance labels for training or development, our method performs comparably to or better than the best systems in the official TREC-COVID evaluation, a COVID-related biomedical search competition. Using distributed computing in modern cloud infrastructure, our system can scale to tens of millions of articles on PubMed and has been deployed as Microsoft Biomedical Search, a new search experience for biomedical literature: https://aka.ms/biomedsearch.

Information Retrieval Computation and Language Digital Libraries

A Domain Specific Ontology Based Semantic Web Search Engine

Debajyoti Mukhopadhyay, Aritra Banik, Sreemoyee Mukherjee 2011

Since its emergence in the 1990s, the World Wide Web (WWW) has rapidly evolved into a huge mine of global information, and it is growing in size every day. The huge amount of resources on the Web thus poses a serious problem for accurate search. This is mainly because today's Web is a human-readable Web, where information cannot be easily processed by machines. The highly sophisticated, efficient keyword-based search engines that have evolved to date have not been able to bridge this gap. This gave rise to the concept of the Semantic Web, envisioned by Tim Berners-Lee as a Web of machine-interpretable information in which information is expressed in a machine-processable form. Based on Semantic Web technologies, we present in this paper the design methodology and development of a Semantic Web search engine that provides exact search results for a domain-specific search. This search engine is developed for an agricultural website that hosts agricultural information about the state of West Bengal.

Information Retrieval