ترغب بنشر مسار تعليمي؟ اضغط هنا

An IR-based Evaluation Framework for Web Search Query Segmentation

65   0   0.0 ( 0 )
 نشر من قبل Rishiraj Saha Roy
 تاريخ النشر 2011
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

This paper presents the first evaluation framework for Web search query segmentation based directly on IR performance. In the past, segmentation strategies were mainly validated against manual annotations. Our work shows that the goodness of a segmentation algorithm as judged through evaluation against a handful of human annotated segmentations hardly reflects its effectiveness in an IR-based setup. In fact, state-of the-art algorithms are shown to perform as good as, and sometimes even better than human annotations -- a fact masked by previous validations. The proposed framework also provides us an objective understanding of the gap between the present best and the best possible segmentation algorithm. We draw these conclusions based on an extensive evaluation of six segmentation strategies, including three most recent algorithms, vis-a-vis segmentations from three human annotators. The evaluation framework also gives insights about which segments should be necessarily detected by an algorithm for achieving the best retrieval results. The meticulously constructed dataset used in our experiments has been made public for use by the research community.

قيم البحث

اقرأ أيضاً

Influenza-like illness (ILI) estimation from web search data is an important web analytics task. The basic idea is to use the frequencies of queries in web search logs that are correlated with past ILI activity as features when estimating current ILI activity. It has been noted that since influenza is seasonal, this approach can lead to spurious correlations with features/queries that also exhibit seasonality, but have no relationship with ILI. Spurious correlations can, in turn, degrade performance. To address this issue, we propose modeling the seasonal variation in ILI activity and selecting queries that are correlated with the residual of the seasonal model and the observed ILI signal. Experimental results show that re-ranking queries obtained by Google Correlate based on their correlation with the residual strongly favours ILI-related queries.
Understanding search queries is critical for shopping search engines to deliver a satisfying customer experience. Popular shopping search engines receive billions of unique queries yearly, each of which can depict any of hundreds of user preferences or intents. In order to get the right results to customers it must be known queries like inexpensive prom dresses are intended to not only surface results of a certain product type but also products with a low price. Referred to as query intents, examples also include preferences for author, brand, age group, or simply a need for customer service. Recent works such as BERT have demonstrated the success of a large transformer encoder architecture with language model pre-training on a variety of NLP tasks. We adapt such an architecture to learn intents for search queries and describe methods to account for the noisiness and sparseness of search query data. We also describe cost effective ways of hosting transformer encoder models in context with low latency requirements. With the right domain-specific training we can build a shareable deep learning model whose internal representation can be reused for a variety of query understanding tasks including query intent identification. Model sharing allows for fewer large models needed to be served at inference time and provides a platform to quickly build and roll out new search query classifiers.
Since its emergence in the 1990s the World Wide Web (WWW) has rapidly evolved into a huge mine of global information and it is growing in size everyday. The presence of huge amount of resources on the Web thus poses a serious problem of accurate sear ch. This is mainly because todays Web is a human-readable Web where information cannot be easily processed by machine. Highly sophisticated, efficient keyword based search engines that have evolved today have not been able to bridge this gap. So comes up the concept of the Semantic Web which is envisioned by Tim Berners-Lee as the Web of machine interpretable information to make a machine processable form for expressing information. Based on the semantic Web technologies we present in this paper the design methodology and development of a semantic Web search engine which provides exact search results for a domain specific search. This search engine is developed for an agricultural Website which hosts agricultural information about the state of West Bengal.
272 - Maksims Volkovs 2015
We present our solution to the Yandex Personalized Web Search Challenge. The aim of this challenge was to use the historical search logs to personalize top-N document rankings for a set of test users. We used over 100 features extracted from user- an d query-depended contexts to train neural net and tree-based learning-to-rank and regression models. Our final submission, which was a blend of several different models, achieved an NDCG@10 of 0.80476 and placed 4th amongst the 194 teams winning 3rd prize.
Geographic location search engines allow users to constrain and order search results in an intuitive manner by focusing a query on a particular geographic region. Geographic search technology, also called location search, has recently received signif icant interest from major search engine companies. Academic research in this area has focused primarily on techniques for extracting geographic knowledge from the web. In this paper, we study the problem of efficient query processing in scalable geographic search engines. Query processing is a major bottleneck in standard web search engines, and the main reason for the thousands of machines used by the major engines. Geographic search engine query processing is different in that it requires a combination of text and spatial data processing techniques. We propose several algorithms for efficient query processing in geographic search engines, integrate them into an existing web search query processor, and evaluate them on large sets of real data and query traces.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا