No Arabic abstract
Scientists always look for the most accurate and relevant answer to their queries on the scholarly literature. Traditional scholarly search systems list documents instead of providing direct answers to the search queries. As data in knowledge graphs are not acquainted semantically, they are not machine-readable. Therefore, a search on scholarly knowledge graphs ends up in a full-text search, not a search in the content of scholarly literature. In this demo, we present a faceted search system that retrieves data from a scholarly knowledge graph, which can be compared and filtered to better satisfy user information needs. Our practices novelty is that we use dynamic facets, which means facets are not fixed and will change according to the content of a comparison.
Scientists always look for the most accurate and relevant answers to their queries in the literature. Traditional scholarly digital libraries list documents in search results, and therefore are unable to provide precise answers to search queries. In other words, search in digital libraries is metadata search and, if available, full-text search. We present a methodology for improving a faceted search system on structured content by leveraging a federation of scholarly knowledge graphs. We implemented the methodology on top of a scholarly knowledge graph. This search system can leverage content from third-party knowledge graphs to improve the exploration of scholarly content. A novelty of our approach is that we use dynamic facets on diverse data types, meaning that facets can change according to the user query. The user can also adjust the granularity of dynamic facets. An additional novelty is that we leverage third-party knowledge graphs to improve exploring scholarly knowledge.
Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible due to the following primary reason: machine inactionable, ambiguous and unstructured content in publications. We present JarvisQA, a BERT based system to answer questions on tabular views of scholarly knowledge graphs. Such tables can be found in a variety of shapes in the scholarly literature (e.g., surveys, comparisons or results). Our system can retrieve direct answers to a variety of different questions asked on tabular data in articles. Furthermore, we present a preliminary dataset of related tables and a corresponding set of natural language questions. This dataset is used as a benchmark for our system and can be reused by others. Additionally, JarvisQA is evaluated on two datasets against other baselines and shows an improvement of two to three folds in performance compared to related methods.
The incompleteness of Knowledge Graphs (KGs) is a crucial issue affecting the quality of AI-based services. In the scholarly domain, KGs describing research publications typically lack important information, hindering our ability to analyse and predict research dynamics. In recent years, link prediction approaches based on Knowledge Graph Embedding models became the first aid for this issue. In this work, we present Trans4E, a novel embedding model that is particularly fit for KGs which include N to M relations with N$gg$M. This is typical for KGs that categorize a large number of entities (e.g., research articles, patents, persons) according to a relatively small set of categories. Trans4E was applied on two large-scale knowledge graphs, the Academia/Industry DynAmics (AIDA) and Microsoft Academic Graph (MAG), for completing the information about Fields of Study (e.g., neural networks, machine learning, artificial intelligence), and affiliation types (e.g., education, company, government), improving the scope and accuracy of the resulting data. We evaluated our approach against alternative solutions on AIDA, MAG, and four other benchmarks (FB15k, FB15k-237, WN18, and WN18RR). Trans4E outperforms the other models when using low embedding dimensions and obtains competitive results in high dimensions.
Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated and high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported in the graph. In total, 2,626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
The vast amount of research produced at institutions world-wide is extremely diverse, and coarse-grained quantitative measures of impact often obscure the individual contributions of these institutions to specific research fields and topics. We show that by applying an information retrieval model to index research articles which are faceted by institution and time, we can develop tools to rank institutions given a keyword query. We present an interactive atlas, Quoka, designed to enable a user to explore these rankings contextually by geography and over time. Through a set of use cases we demonstrate that the atlas can be used to perform sensemaking tasks to learn and collect information about the relationships between institutions and scholarly knowledge production.