We introduce GrapAL (Graph database of Academic Literature), a versatile tool for exploring and investigating a knowledge base of scientific literature that was semi-automatically constructed using NLP methods. GrapAL satisfies a variety of use cases and information needs requested by researchers. At the core of GrapAL is a Neo4j graph database with an intuitive schema and a simple query language. In this paper, we describe the basic elements of GrapAL, how to use it, and several use cases, such as finding experts on a given topic for peer reviewing, discovering indirect connections between biomedical entities, and computing citation-based metrics. We open-source the demo code to help other researchers develop applications that build on GrapAL.
The Geant4 reference paper published in Nuclear Instruments and Methods A in 2003 has become the most cited publication in the whole Nuclear Science and Technology category of Thomson-Reuters Journal Citation Reports. It is currently the second most cited article among the publications authored by two major research institutes, CERN and INFN. An overview of Geant4 presence (and absence) in scholarly literature is presented; the patterns of Geant4 citations are quantitatively examined and discussed.
Scholars frequently employ relatedness measures to estimate the similarity between two different items (e.g., documents, authors, and institutes). Such relatedness measures are commonly based on overlapping references (i.e., bibliographic coupling) or citations (i.e., co-citation) and can then be used with cluster analysis to find boundaries between research fields. Unfortunately, calculating a relatedness measure is challenging, especially for a large number of items, because the computational complexity is greater than linear. We propose an alternative method for identifying the research front that uses direct citation and is inspired by relatedness measures. Our novel approach simply replicates each node into two distinct nodes: a citing node and a cited node. We then apply typical clustering methods to the modified network. Clusters of citing nodes should emulate those from the bibliographic coupling relatedness network, while clusters of cited nodes should act like those from the co-citation relatedness network. In validation tests, our proposed method demonstrated high levels of similarity with conventional relatedness-based methods. We also found that the clustering results of the proposed method outperformed those of conventional relatedness-based measures regarding similarity with natural language processing-based classification.
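The node-replication idea can be illustrated with a minimal sketch. It assumes the citation data is a list of (citing paper, cited paper) pairs, and it uses connected components as a deliberately simple stand-in for the "typical clustering methods" mentioned above; the abstract does not say which clustering algorithm was used. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def split_citation_network(citations):
    """Replicate each paper into a citing copy and a cited copy.

    `citations` is an iterable of (citing_paper, cited_paper) pairs.
    Returns an undirected adjacency map over nodes tagged by role.
    """
    adjacency = defaultdict(set)
    for source, target in citations:
        citing_node = ("citing", source)
        cited_node = ("cited", target)
        adjacency[citing_node].add(cited_node)
        adjacency[cited_node].add(citing_node)
    return dict(adjacency)


def connected_components(adjacency):
    """Stand-in clustering step (an assumption): connected components."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def clusters_by_role(components, role):
    """Keep only the papers that play `role` ("citing" or "cited") per cluster."""
    result = []
    for component in components:
        papers = sorted(paper for tag, paper in component if tag == role)
        if papers:
            result.append(papers)
    return sorted(result)


# Hypothetical toy data: paper X is both cited (by A and B) and citing (of Y).
toy_citations = [
    ("A", "X"), ("B", "X"), ("A", "Z"),
    ("C", "Y"), ("D", "Y"), ("X", "Y"),
]
graph = split_citation_network(toy_citations)
components = connected_components(graph)
citing_clusters = clusters_by_role(components, "citing")
cited_clusters = clusters_by_role(components, "cited")
print(citing_clusters)
print(cited_clusters)
```

In this toy network, papers A and B land in the same citing cluster because they share the reference X, which mirrors bibliographic coupling. Paper X appears in one cluster in its cited role and in a different cluster in its citing role, which is the effect of splitting each node in two.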
The broad coverage of the search for the Higgs boson in the mainstream media is a relative novelty for high energy physics (HEP) research, whose achievements have traditionally been limited to scholarly literature. This paper illustrates the results of a scientometric analysis of HEP computing in scientific literature, institutional media and the press, and a comparative overview of similar metrics concerning representative particle physics measurements. The picture emerging from these scientometric data documents the scientific impact and social perception of HEP computing. The results of this analysis suggest that improved communication of the scientific and social role of HEP computing would be beneficial to the high energy physics community.
The Center for Expanded Data Annotation and Retrieval (CEDAR) aims to revolutionize the way that metadata describing scientific experiments are authored. The software we have developed--the CEDAR Workbench--is a suite of Web-based tools and REST APIs that allows users to construct metadata templates, to fill in templates to generate high-quality metadata, and to share and manage these resources. The CEDAR Workbench provides a versatile, REST-based environment for authoring metadata that are enriched with terms from ontologies. The metadata are available as JSON, JSON-LD, or RDF for easy integration in scientific applications and reusability on the Web. Users can leverage our APIs for validating and submitting metadata to external repositories. The CEDAR Workbench is freely available and open-source.
Allometric scaling can reflect underlying mechanisms, dynamics and structures in complex systems; examples include typical scaling laws in biology, ecology and urban development. In this work, we study allometric scaling in scientific fields. By performing an analysis of the outputs/inputs of various scientific fields, including the numbers of publications, citations, and references, with respect to the number of authors, we find that in all fields that we have studied thus far, including physics, mathematics and economics, there are allometric scaling laws relating the outputs/inputs and the sizes of scientific fields. Furthermore, the exponents of the scaling relations have remained quite stable over the years. We also find that the deviations of individual subfields from the overall scaling laws are good indicators for ranking subfields independently of their sizes.
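As a rough illustration, an allometric scaling law has the power-law form Y = c * N^alpha, where N is the field size (here, the number of authors) and Y is an output or input such as the number of publications. The sketch below assumes the exponent is estimated by ordinary least squares on log-transformed values and that a subfield's deviation is measured as its log-residual from the fitted law; the abstract specifies neither choice. The function names and the synthetic numbers are hypothetical.

```python
import math


def fit_power_law(sizes, outputs):
    """Least-squares fit of log(output) = log(c) + alpha * log(size)."""
    logs_x = [math.log(s) for s in sizes]
    logs_y = [math.log(o) for o in outputs]
    n = len(logs_x)
    mean_x = sum(logs_x) / n
    mean_y = sum(logs_y) / n
    sxx = sum((x - mean_x) ** 2 for x in logs_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(logs_x, logs_y))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


def log_deviation(size, output, prefactor, exponent):
    """Log-residual of one subfield from the fitted scaling law (an assumption)."""
    expected = math.log(prefactor) + exponent * math.log(size)
    return math.log(output) - expected


# Synthetic data (hypothetical): fields that follow Y = 2 * N^1.5 exactly.
overall_sizes = [10, 100, 1000, 10000]
overall_outputs = [2.0 * s ** 1.5 for s in overall_sizes]
prefactor, exponent = fit_power_law(overall_sizes, overall_outputs)

# Two hypothetical subfields: one above the law, one below it.
subfields = {
    "subfield_a": (50, 1.5 * 2.0 * 50 ** 1.5),
    "subfield_b": (500, 0.5 * 2.0 * 500 ** 1.5),
}
ranking = sorted(
    subfields,
    key=lambda name: log_deviation(*subfields[name], prefactor, exponent),
    reverse=True,
)
print(prefactor, exponent, ranking)
```

Ranking by the log-residual rather than by raw output compares each subfield against what the scaling law predicts for its size, so a small subfield can outrank a much larger one.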