
Unbiased evaluation of ranking metrics reveals consistent performance in science and technology citation data

Published by Matus Medo
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Despite the increasing use of citation-based metrics for research evaluation purposes, we do not yet know which metrics best deliver on their promise to gauge the significance of a scientific paper or a patent. We assess 17 network-based metrics by their ability to identify milestone papers and patents in three large citation datasets. We find that traditional information-retrieval evaluation metrics are strongly affected by the interplay between the age distribution of the milestone items and the age biases of the evaluated metrics. Outcomes of these evaluation metrics are therefore not representative of the ranking ability of the metrics under study. We argue in favor of a modified evaluation procedure that explicitly penalizes biased metrics and allows us to reveal performance patterns that are consistent across the datasets. PageRank and LeaderRank turn out to be the best-performing ranking metrics when their age bias is suppressed by a simple transformation of the scores that they produce, whereas other popular metrics, including citation count, HITS and Collective Influence, produce significantly worse ranking results.
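
The abstract does not spell out the "simple transformation" used to suppress age bias. As an illustration only, the sketch below assumes one plausible form: each item's score is converted to a z-score relative to items of similar age. The function name `rescale_by_age`, the `window` parameter, and the choice of a symmetric window are assumptions made here, not details taken from the paper.

```python
from statistics import mean, pstdev


def rescale_by_age(scores, window=3):
    """Return age-rescaled scores (hypothetical helper, not from the paper).

    `scores` must be ordered by item age (oldest first). Each score is
    compared with up to `window` neighbours on either side in that order,
    so an item is judged only against items of similar age.
    """
    n = len(scores)
    rescaled = []
    for i, score in enumerate(scores):
        # Peers are the items closest in age, including the item itself.
        peers = scores[max(0, i - window):min(n, i + window + 1)]
        mu = mean(peers)
        sigma = pstdev(peers)
        # If all peers share the same score, no item stands out.
        rescaled.append((score - mu) / sigma if sigma > 0 else 0.0)
    return rescaled
```

Under this assumption, an old item with a high raw score earns a high rescaled score only if it also outscores other old items, so old and recent items compete on more equal terms.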




Read also

170 - Massimo Franceschet 2011
We analyse the large-scale structure of the journal citation network built from information contained in the Thomson-Reuters Journal Citation Reports. To this end, we take advantage of the network science paraphernalia and explore network properties like density, percolation robustness, average and largest node distances, reciprocity, incoming and outgoing degree distributions, as well as assortative mixing by node degrees. We discover that the journal citation network is a dense, robust, small, and reciprocal world. Furthermore, in and out node degree distributions display long-tails, with few vital journals and many trivial ones, and they are strongly positively correlated.
We analyze the role of first (leading) author gender on the number of citations that a paper receives, on the publishing frequency and on the self-citing tendency. We consider a complete sample of over 200,000 publications from 1950 to 2015 from five major astronomy journals. We determine the gender of the first author for over 70% of all publications. The fraction of papers which have a female first author has increased from less than 5% in the 1960s to about 25% today. We find that the increase of the fraction of papers authored by females is slowest in the most prestigious journals such as Science and Nature. Furthermore, female authors write 19$\pm$7% fewer papers in seven years following their first paper than their male colleagues. At all times papers with male first authors receive more citations than papers with female first authors. This difference has been decreasing with time and amounts to $\sim$6% measured over the last 30 years. To account for the fact that the properties of female and male first author papers differ intrinsically, we use a random forest algorithm to control for the non-gender specific properties of these papers which include seniority of the first author, number of references, total number of authors, year of publication, publication journal, field of study and region of the first author's institution. We show that papers authored by females receive 10.4$\pm$0.9% fewer citations than what would be expected if the papers with the same non-gender specific properties were written by male authors. Finally, we also find that female authors in our sample tend to self-cite more, but that this effect disappears when controlled for non-gender specific variables.
While implicit feedback (e.g., clicks, dwell times, etc.) is an abundant and attractive source of data for learning to rank, it can produce unfair ranking policies for both exogenous and endogenous reasons. Exogenous reasons typically manifest themselves as biases in the training data, which then get reflected in the learned ranking policy and often lead to rich-get-richer dynamics. Moreover, even after the correction of such biases, reasons endogenous to the design of the learning algorithm can still lead to ranking policies that do not allocate exposure among items in a fair way. To address both exogenous and endogenous sources of unfairness, we present the first learning-to-rank approach that addresses both presentation bias and merit-based fairness of exposure simultaneously. Specifically, we define a class of amortized fairness-of-exposure constraints that can be chosen based on the needs of an application, and we show how these fairness criteria can be enforced despite the selection biases in implicit feedback data. The key result is an efficient and flexible policy-gradient algorithm, called FULTR, which is the first to enable the use of counterfactual estimators for both utility estimation and fairness constraints. Beyond the theoretical justification of the framework, we show empirically that the proposed algorithm can learn accurate and fair ranking policies from biased and noisy feedback.
119 - Massimo Franceschet 2011
We represent collaboration of authors in computer science papers in terms of both affiliation and collaboration networks and observe how these networks evolved over time since 1960. We investigate the temporal evolution of bibliometric properties, like size of the discipline, productivity of scholars, and collaboration level in papers, as well as of large-scale network properties, like reachability and average separation distance among scientists, distribution of the number of scholar collaborators, network clustering and network assortativity by number of collaborators.
Hierarchical domain-specific classification schemas (or subject heading vocabularies) are often used to identify, classify, and disambiguate concepts that occur in scholarly articles. In this work, we develop, apply, and evaluate a human-in-the-loop workflow that first extracts an initial category tree from crowd-sourced Wikipedia data, and then combines community detection, machine learning, and hand-crafted heuristics or rules to prune the initial tree. This work resulted in WikiCSSH; a large-scale, hierarchically organized vocabulary for the domain of computer science (CS). Our evaluation suggests that WikiCSSH outperforms alternative CS vocabularies in terms of vocabulary size as well as the performance of lexicon-based key-phrase extraction from scholarly data. WikiCSSH can further distinguish between coarse-grained versus fine-grained CS concepts. The outlined workflow can serve as a template for building hierarchically-organized subject heading vocabularies for other domains that are covered in Wikipedia.