No Arabic abstract
Explicitly modelling field interactions and correlations in complex document structures has recently gained popularity in neural document embedding and retrieval tasks. Although this requires the specification of bespoke task-dependent models, encouraging empirical results are beginning to emerge. We present the first in-depth analyses of non-linear multi-field interaction (NL-MFI) ranking in the cooking domain in this work. Our results show that field-weighted factorisation machines models provide a statistically significant improvement over baselines in recipe retrieval tasks. Additionally, we show that sparsely capturing subsets of field interactions based on domain knowledge and feature selection heuristics offers significant advantages over baselines and exhaustive alternatives. Although field-interaction aware models are more elaborate from an architectural basis, they are often more data-efficient in optimisation and are better suited for explainability due to mirrored document and model factorisation.
Ranking models are the main components of information retrieval systems. Several approaches to ranking are based on traditional machine learning algorithms using a set of hand-crafted features. Recently, researchers have leveraged deep learning models in information retrieval. These models are trained end-to-end to extract features from the raw data for ranking tasks, so that they overcome the limitations of hand-crafted features. A variety of deep learning models have been proposed, and each model presents a set of neural network components to extract features that are used for ranking. In this paper, we compare the proposed models in the literature along different dimensions in order to understand the major contributions and limitations of each model. In our discussion of the literature, we analyze the promising neural components, and propose future research directions. We also show the analogy between document retrieval and other retrieval tasks where the items to be ranked are structured documents, answers, images and videos.
Recently, we have witnessed the bloom of neural ranking models in the information retrieval (IR) field. So far, much effort has been devoted to developing effective neural ranking models that can generalize well on new data. There has been less attention paid to the robustness perspective. Unlike the effectiveness which is about the average performance of a system under normal purpose, robustness cares more about the system performance in the worst case or under malicious operations instead. When a new technique enters into the real-world application, it is critical to know not only how it works in average, but also how would it behave in abnormal situations. So we raise the question in this work: Are neural ranking models robust? To answer this question, firstly, we need to clarify what we refer to when we talk about the robustness of ranking models in IR. We show that robustness is actually a multi-dimensional concept and there are three ways to define it in IR: 1) The performance variance under the independent and identically distributed (I.I.D.) setting; 2) The out-of-distribution (OOD) generalizability; and 3) The defensive ability against adversarial operations. The latter two definitions can be further specified into two different perspectives respectively, leading to 5 robustness tasks in total. Based on this taxonomy, we build corresponding benchmark datasets, design empirical experiments, and systematically analyze the robustness of several representative neural ranking models against traditional probabilistic ranking models and learning-to-rank (LTR) models. The empirical results show that there is no simple answer to our question. While neural ranking models are less robust against other IR models in most cases, some of them can still win 1 out of 5 tasks. This is the first comprehensive study on the robustness of neural ranking models.
We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI. Our system has been online and serving users since late March 2020. The Covidex is the user application component of our three-pronged strategy to develop technologies for helping domain experts tackle the ongoing global pandemic. In addition, we provide robust and easy-to-use keyword search infrastructure that exploits mature fusion-based methods as well as standalone neural ranking models that can be incorporated into other applications. These techniques have been evaluated in the ongoing TREC-COVID challenge: Our infrastructure and baselines have been adopted by many participants, including some of the highest-scoring runs in rounds 1, 2, and 3. In round 3, we report the highest-scoring run that takes advantage of previous training data and the second-highest fully automatic run.
Ranking tasks are usually based on the text of the main body of the page and the actions (clicks) of users on the page. There are other elements that could be leveraged to better contextualise the ranking experience (e.g. text in other fields, query made by the user, images, etc). We present one of the first in-depth analyses of field interaction for multiple field ranking in two separate datasets. While some works have taken advantage of full document structure, some aspects remain unexplored. In this work we build on previous analyses to show how query-field interactions, non-linear field interactions, and the architecture of the underlying neural model affect performance.
As a critical component for online advertising and marking, click-through rate (CTR) prediction has draw lots of attentions from both industry and academia field. Recently, the deep learning has become the mainstream methodological choice for CTR. Despite of sustainable efforts have been made, existing approaches still pose several challenges. On the one hand, high-order interaction between the features is under-explored. On the other hand, high-order interactions may neglect the semantic information from the low-order fields. In this paper, we proposed a novel prediction method, named FINT, that employs the Field-aware INTeraction layer which captures high-order feature interactions while retaining the low-order field information. To empirically investigate the effectiveness and robustness of the FINT, we perform extensive experiments on the three realistic databases: KDD2012, Criteo and Avazu. The obtained results demonstrate that the FINT can significantly improve the performance compared to the existing methods, without increasing the amount of computation required. Moreover, the proposed method brought about 2.72% increase to the advertising revenue of a big online video app through A/B testing. To better promote the research in CTR field, we released our code as well as reference implementation at: https://github.com/zhishan01/FINT.