ﻻ يوجد ملخص باللغة العربية
The ad-hoc retrieval task is to rank related documents given a query and a document collection. A series of deep learning based approaches have been proposed to solve such problem and gained lots of attention. However, we argue that they are inherently based on local word sequences, ignoring the subtle long-distance document-level word relationships. To solve the problem, we explicitly model the document-level word relationship through the graph structure, capturing the subtle information via graph neural networks. In addition, due to the complexity and scale of the document collections, it is considerable to explore the different grain-sized hierarchical matching signals at a more general level. Therefore, we propose a Graph-based Hierarchical Relevance Matching model (GHRM) for ad-hoc retrieval, by which we can capture the subtle and general hierarchical matching signals simultaneously. We validate the effects of GHRM over two representative ad-hoc retrieval benchmarks, the comprehensive experiments and results demonstrate its superiority over state-of-the-art methods.
Product quantization (PQ) is a widely used technique for ad-hoc retrieval. Recent studies propose supervised PQ, where the embedding and quantization models can be jointly trained with supervised learning. However, there is a lack of appropriate form
Traditional statistical retrieval models often treat each document as a whole. In many cases, however, a document is relevant to a query only because a small part of it contain the targeted information. In this work, we propose a neural passage model
Designing pre-training objectives that more closely resemble the downstream tasks for pre-trained language models can lead to better performance at the fine-tuning stage, especially in the ad-hoc retrieval area. Existing pre-training approaches tailo
Rapidly growing online podcast archives contain diverse content on a wide range of topics. These archives form an important resource for entertainment and professional use, but their value can only be realized if users can rapidly and reliably locate
On most sponsored search platforms, advertisers bid on some keywords for their advertisements (ads). Given a search request, ad retrieval module rewrites the query into bidding keywords, and uses these keywords as keys to select Top N ads through inv