ﻻ يوجد ملخص باللغة العربية
Name disambiguation is a key and also a very tough problem in many online systems such as social search and academic search. Despite considerable research, a critical issue that has not been systematically studied is disambiguation on the fly -- to complete the disambiguation in the real-time. This is very challenging, as the disambiguation algorithm must be accurate, efficient, and error tolerance. In this paper, we propose a novel framework -- CONNA -- to train a matching component and a decision component jointly via reinforcement learning. The matching component is responsible for finding the top matched candidate for the given paper, and the decision component is responsible for deciding on assigning the top matched person or creating a new person. The two components are intertwined and can be bootstrapped via jointly training. Empirically, we evaluate CONNA on two name disambiguation datasets. Experimental results show that the proposed framework can achieve a 1.21%-19.84% improvement on F1-score using joint training of the matching and the decision components. The proposed CONNA has been successfully deployed on AMiner -- a large online academic search system.
Author name ambiguity causes inadequacy and inconvenience in academic information retrieval, which raises the necessity of author name disambiguation (AND). Existing AND methods can be divided into two categories: the models focusing on content infor
We present a manually-labeled Author Name Disambiguation(AND) Dataset called WhoisWho, which consists of 399,255 documents and 45,187 distinct authors with 421 ambiguous author names. To label such a great amount of AND data of high accuracy, we prop
Users on Twitter are commonly identified by their profile names. These names are used when directly addressing users on Twitter, are part of their profile page URLs, and can become a trademark for popular accounts, with people referring to celebritie
We present a novel algorithm and validation method for disambiguating author names in very large bibliographic data sets and apply it to the full Web of Science (WoS) citation index. Our algorithm relies only upon the author and citation graphs avail
Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library applications such as search and citation analysis. While