
FAIR: Fairness-Aware Information Retrieval Evaluation

Added by Ruoyuan Gao
Publication date: 2021
Language: English





With the emerging need to create fairness-aware solutions for search and recommendation systems, a daunting challenge is how to evaluate such solutions. While many traditional information retrieval (IR) metrics can capture relevance, diversity, and novelty as aspects of user utility, they are not suitable for inferring whether the presented results are fair from the perspective of responsible information exposure. On the other hand, various fairness metrics have been proposed, but they either do not account for user utility or do not measure it adequately. To address this problem, we propose a new metric called Fairness-Aware IR (FAIR). By unifying standard IR metrics and fairness measures into an integrated metric, FAIR offers a new perspective for evaluating fairness-aware ranking results. Based on this metric, we developed an effective ranking algorithm that jointly optimizes user utility and fairness. The experimental results showed that the FAIR metric could highlight results with both good user utility and fair information exposure. We showed how FAIR relates to existing metrics and demonstrated the effectiveness of our FAIR-based algorithm. We believe our work opens up a new direction toward a computationally feasible metric for evaluating and implementing fairness-aware IR systems.
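The abstract does not spell out how FAIR is computed, so the sketch below only illustrates the general idea of unifying a standard utility metric with an exposure-based fairness term. The DCG-based utility, the target-share fairness term, and the alpha weighting are assumptions made for illustration, not the paper's actual definition.

```python
# Illustrative sketch only: combines a standard utility metric (DCG) with a
# simple group-exposure fairness term. The actual FAIR metric may be defined
# differently; names, terms, and weights here are assumptions.
import math
from collections import defaultdict

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def group_exposure(groups):
    """Position-discounted share of exposure received by each group."""
    exposure = defaultdict(float)
    for rank, g in enumerate(groups):
        exposure[g] += 1.0 / math.log2(rank + 2)
    total = sum(exposure.values())
    return {g: e / total for g, e in exposure.items()}

def fairness_term(groups, target_share):
    """1 minus the total deviation of realized exposure from a target share."""
    exposure = group_exposure(groups)
    deviation = sum(abs(exposure.get(g, 0.0) - s) for g, s in target_share.items())
    return max(0.0, 1.0 - deviation)

def fairness_aware_score(relevances, groups, target_share, alpha=0.5):
    """Convex combination of normalized utility and fairness (illustrative)."""
    ideal = dcg(sorted(relevances, reverse=True)) or 1.0
    utility = dcg(relevances) / ideal            # NDCG-style utility
    fairness = fairness_term(groups, target_share)
    return alpha * utility + (1 - alpha) * fairness

# Example: 5 results, two content groups, target 50/50 exposure.
print(fairness_aware_score([3, 2, 3, 0, 1], ["A", "A", "B", "B", "A"],
                           {"A": 0.5, "B": 0.5}))
```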



Related research
This report describes metrics for evaluating the effectiveness of segment-based retrieval, built on existing binary information retrieval metrics. The metrics are described in the context of a task involving the hyperlinking of video segments. The evaluation approach re-uses existing evaluation measures from the standard Cranfield evaluation paradigm. Our adaptation can in principle be used with any effectiveness measure that uses binary relevance, and for other segment-based retrieval tasks. In our video hyperlinking setting, we use precision at a cut-off rank n and mean average precision.
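For reference, a minimal implementation of the two binary-relevance measures named above, precision at a cut-off rank n and mean average precision, might look as follows; the segment-overlap adaptation described in the report is not reproduced here.

```python
# Standard binary-relevance measures: precision at a cut-off rank n and mean
# average precision. Inputs are plain 0/1 relevance judgements per ranked list.
def precision_at_n(relevance, n):
    """Fraction of the top-n retrieved items that are relevant (0/1 labels)."""
    top = relevance[:n]
    return sum(top) / n if n else 0.0

def average_precision(relevance):
    """Average of precision values at the ranks of relevant items."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / rank
    return score / hits if hits else 0.0

def mean_average_precision(runs):
    """Mean of average precision over a set of queries."""
    return sum(average_precision(r) for r in runs) / len(runs)

# Example: binary relevance of ranked results for two queries.
print(precision_at_n([1, 0, 1, 1, 0], 3))           # 2/3
print(mean_average_precision([[1, 0, 1], [0, 1]]))  # mean AP over two queries
```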
Bhaskar Mitra, 2020
Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from those in these other application areas. A common form of IR involves ranking documents, or short passages, in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms, such as a person's name or a product model number, not seen during training, and avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, retrieval involves extremely large collections, such as the document index of a commercial Web search engine, containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to retrieve efficiently from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks.
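As a purely illustrative aside on the inverted index mentioned above, a toy sketch of building one and using it for boolean retrieval could look like this; it is not code from the thesis.

```python
# Toy inverted index: map each term to the documents containing it, then
# retrieve candidates for a query by intersecting posting lists.
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict of doc_id -> text. Returns term -> set of doc_ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def retrieve(index, query):
    """Return doc_ids containing every query term (boolean AND retrieval)."""
    postings = [index.get(t, set()) for t in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = {1: "neural ranking of documents", 2: "keyword based document ranking"}
index = build_inverted_index(docs)
print(retrieve(index, "document ranking"))   # {2}
```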
Rankings of people and items have been widely used in selection-making, match-making, and recommendation algorithms deployed on a range of platforms, from employment websites to search tools. The ranking position of a candidate affects the amount of opportunity the candidate receives. Several works have observed that ranking candidates by their score can be biased against candidates belonging to a minority community. In recent work, fairness-aware representative ranking was proposed for computing a fairness-aware re-ranking of results. The proposed algorithm achieves the desired distribution of top-ranked results with respect to one or more protected attributes. In this work, we highlight the bias in fairness-aware representative ranking for an individual, as well as for a group, when the group is sub-active on the platform. We define individual unfairness and group unfairness and propose methods to generate ideal individually and group-fair representative rankings when the universal representation ratio is known or unknown. Simulation results provide a quantified analysis of fairness in the proposed solutions. The paper concludes with open challenges and further directions.
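The abstract does not include the re-ranking procedure itself; a generic greedy sketch of the idea (rank by score while keeping each protected group near a desired share of every ranking prefix) is given below. The function names, the ceiling-based constraint, and the tie-breaking fallback are assumptions, not the paper's exact algorithm.

```python
# Illustrative greedy fair re-ranking: at each position, pick the highest-
# scoring remaining candidate whose group has not yet exceeded its desired
# share of the current prefix. Generic sketch, not the paper's algorithm.
from math import ceil

def fair_rerank(candidates, target_share):
    """candidates: list of (id, score, group); target_share: group -> fraction."""
    remaining = sorted(candidates, key=lambda c: c[1], reverse=True)
    ranking = []
    counts = {g: 0 for g in target_share}
    while remaining:
        k = len(ranking) + 1
        # Groups still below their ceiling for a prefix of length k.
        allowed = {g for g, share in target_share.items()
                   if counts[g] < ceil(share * k)}
        pick = next((c for c in remaining if c[2] in allowed), remaining[0])
        remaining.remove(pick)
        ranking.append(pick)
        counts[pick[2]] = counts.get(pick[2], 0) + 1
    return ranking

# Example: scores favour group "A"; a 50/50 target interleaves "B" earlier.
cands = [("a1", .9, "A"), ("a2", .8, "A"), ("a3", .7, "A"),
         ("b1", .6, "B"), ("b2", .5, "B"), ("b3", .4, "B")]
for item in fair_rerank(cands, {"A": 0.5, "B": 0.5}):
    print(item)
```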
Recommendation algorithms typically build models based on historical user-item interactions (e.g., clicks, likes, or ratings) to provide a personalized ranked list of items. These interactions are often distributed unevenly over different groups of items due to varying user preferences. However, we show that recommendation algorithms can inherit or even amplify this imbalanced distribution, leading to unfair recommendations for item groups. Concretely, we formalize the concepts of ranking-based statistical parity and equal opportunity as two measures of fairness in personalized ranking recommendation for item groups. We then empirically show that one of the most widely adopted algorithms, Bayesian Personalized Ranking, produces unfair recommendations, which motivates our effort to propose a novel fairness-aware personalized ranking model. The debiased model is able to improve the two proposed fairness metrics while preserving recommendation performance. Experiments on three public datasets show strong fairness improvement of the proposed model versus state-of-the-art alternatives. This paper is an extended and reorganized version of our SIGIR 2020 paper [zhu2020measuring]. In this paper, we re-frame the studied problem as 'item recommendation fairness' in personalized ranking recommendation systems, and provide more details about the training process of the proposed model and the experiment setup.
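The precise definitions of the two fairness measures are not given in the abstract; one common way to operationalize a ranking-based statistical parity check (comparing each item group's share of top-k recommendation slots across users) is sketched below as an assumption, not as the paper's formulation.

```python
# Illustrative ranking-based statistical parity check: compare each item
# group's share of top-k recommendation slots across users. The paper's
# precise definitions may differ; this is a generic formulation.
from collections import Counter

def topk_group_shares(rankings, item_group, k=10):
    """Share of top-k slots (over all users) occupied by each item group."""
    slots = Counter({g: 0 for g in set(item_group.values())})
    for ranked_items in rankings:
        for item in ranked_items[:k]:
            slots[item_group[item]] += 1
    total = sum(slots.values())
    return {g: c / total for g, c in slots.items()}

def statistical_parity_gap(rankings, item_group, k=10):
    """Absolute gap between the most- and least-exposed groups (0 = parity)."""
    shares = topk_group_shares(rankings, item_group, k)
    return max(shares.values()) - min(shares.values())

# Example: two users, four items in two groups, k=2.
groups = {"i1": "pop", "i2": "pop", "i3": "niche", "i4": "niche"}
recs = [["i1", "i2", "i3"], ["i2", "i1", "i4"]]
print(statistical_parity_gap(recs, groups, k=2))   # 1.0: only "pop" in top-2
```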
Multiple neural language models have been developed recently, e.g., BERT and XLNet, and have achieved impressive results in various NLP tasks including sentence classification, question answering, and document ranking. In this paper, we explore the use of the popular bidirectional language model BERT to model and learn the relevance between English queries and foreign-language documents in the task of cross-lingual information retrieval. A deep relevance matching model based on BERT is introduced and trained by fine-tuning a pretrained multilingual BERT model with weak supervision, using home-made CLIR training data derived from parallel corpora. Experimental results on the retrieval of Lithuanian documents against short English queries show that our model is effective and outperforms competitive baseline approaches.
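A minimal sketch of the general setup, scoring an English query against foreign-language documents with a multilingual BERT cross-encoder via the Hugging Face transformers library, is shown below. The checkpoint name, the single-logit classification head, and the absence of fine-tuning are placeholders; the paper fine-tunes its own model on weakly supervised CLIR data derived from parallel corpora.

```python
# Minimal sketch: score an English query against foreign-language documents
# with a multilingual BERT cross-encoder. The checkpoint and untrained head
# below are placeholders, not the paper's fine-tuned model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # assumption: any mBERT-style model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=1)
model.eval()

def relevance_scores(query, documents):
    """Score each (query, document) pair; higher means more relevant."""
    batch = tokenizer([query] * len(documents), documents,
                      padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits          # shape: (num_docs, 1)
    return logits.squeeze(-1).tolist()

# Example: rank two Lithuanian documents for a short English query.
docs = ["Vilnius yra Lietuvos sostinė.", "Katė miega ant stalo."]
for doc, score in sorted(zip(docs, relevance_scores("capital of Lithuania", docs)),
                         key=lambda p: p[1], reverse=True):
    print(round(score, 3), doc)
```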