ترغب بنشر مسار تعليمي؟ اضغط هنا

Ranking Robustness Under Adversarial Document Manipulations

118   0   0.0 ( 0 )
 نشر من قبل Gregory Goren
 تاريخ النشر 2018
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

For many queries in the Web retrieval setting there is an on-going ranking competition: authors manipulate their documents so as to promote them in rankings. Such competitions can have unwarranted effects not only in terms of retrieval effectiveness, but also in terms of ranking robustness. A case in point, rankings can (rapidly) change due to small indiscernible perturbations of documents. While there has been a recent growing interest in analyzing the robustness of classifiers to adversarial manipulations, there has not yet been a study of the robustness of relevance-ranking functions. We address this challenge by formally analyzing different definitions and aspects of the robustness of learning-to-rank-based ranking functions. For example, we formally show that increased regularization of linear ranking functions increases ranking robustness. This finding leads us to conjecture that decreased variance of any ranking function results in increased robustness. We propose several measures for quantifying ranking robustness and use them to analyze ranking competitions between documents authors. The empirical findings support our formal analysis and conjecture for both RankSVM and LambdaMART.



قيم البحث

اقرأ أيضاً

Ranking models are the main components of information retrieval systems. Several approaches to ranking are based on traditional machine learning algorithms using a set of hand-crafted features. Recently, researchers have leveraged deep learning model s in information retrieval. These models are trained end-to-end to extract features from the raw data for ranking tasks, so that they overcome the limitations of hand-crafted features. A variety of deep learning models have been proposed, and each model presents a set of neural network components to extract features that are used for ranking. In this paper, we compare the proposed models in the literature along different dimensions in order to understand the major contributions and limitations of each model. In our discussion of the literature, we analyze the promising neural components, and propose future research directions. We also show the analogy between document retrieval and other retrieval tasks where the items to be ranked are structured documents, answers, images and videos.
Ranking tasks are usually based on the text of the main body of the page and the actions (clicks) of users on the page. There are other elements that could be leveraged to better contextualise the ranking experience (e.g. text in other fields, query made by the user, images, etc). We present one of the first in-depth analyses of field interaction for multiple field ranking in two separate datasets. While some works have taken advantage of full document structure, some aspects remain unexplored. In this work we build on previous analyses to show how query-field interactions, non-linear field interactions, and the architecture of the underlying neural model affect performance.
This work proposes a novel adaptation of a pretrained sequence-to-sequence model to the task of document ranking. Our approach is fundamentally different from a commonly-adopted classification-based formulation of ranking, based on encoder-only pretr ained transformer architectures such as BERT. We show how a sequence-to-sequence model can be trained to generate relevance labels as target words, and how the underlying logits of these target words can be interpreted as relevance probabilities for ranking. On the popular MS MARCO passage ranking task, experimental results show that our approach is at least on par with previous classification-based models and can surpass them with larger, more-recent models. On the test collection from the TREC 2004 Robust Track, we demonstrate a zero-shot transfer-based approach that outperforms previous state-of-the-art models requiring in-dataset cross-validation. Furthermore, we find that our approach significantly outperforms an encoder-only model in a data-poor regime (i.e., with few training examples). We investigate this observation further by varying target words to probe the models use of latent knowledge.
Context information in search sessions has proven to be useful for capturing user search intent. Existing studies explored user behavior sequences in sessions in different ways to enhance query suggestion or document ranking. However, a user behavior sequence has often been viewed as a definite and exact signal reflecting a users behavior. In reality, it is highly variable: users queries for the same intent can vary, and different documents can be clicked. To learn a more robust representation of the user behavior sequence, we propose a method based on contrastive learning, which takes into account the possible variations in users behavior sequences. Specifically, we propose three data augmentation strategies to generate similar variants of user behavior sequences and contrast them with other sequences. In so doing, the model is forced to be more robust regarding the possible variations. The optimized sequence representation is incorporated into document ranking. Experiments on two real query log datasets show that our proposed model outperforms the state-of-the-art methods significantly, which demonstrates the effectiveness of our method for context-aware document ranking.
Recommendation algorithms typically build models based on historical user-item interactions (e.g., clicks, likes, or ratings) to provide a personalized ranked list of items. These interactions are often distributed unevenly over different groups of i tems due to varying user preferences. However, we show that recommendation algorithms can inherit or even amplify this imbalanced distribution, leading to unfair recommendations to item groups. Concretely, we formalize the concepts of ranking-based statistical parity and equal opportunity as two measures of fairness in personalized ranking recommendation for item groups. Then, we empirically show that one of the most widely adopted algorithms -- Bayesian Personalized Ranking -- produces unfair recommendations, which motivates our effort to propose the novel fairness-aware personalized ranking model. The debiased model is able to improve the two proposed fairness metrics while preserving recommendation performance. Experiments on three public datasets show strong fairness improvement of the proposed model versus state-of-the-art alternatives. This is paper is an extended and reorganized version of our SIGIR 2020~cite{zhu2020measuring} paper. In this paper, we re-frame the studied problem as `item recommendation fairness in personalized ranking recommendation systems, and provide more details about the training process of the proposed model and details of experiment setup.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا