Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

QURATOR: Innovative Technologies for Content and Data Curation

72 0 0.0 ( 0 )

Download Cite

Added by Georg Rehm

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Georg Rehm - Peter Bourgonje - Stefanie Hegele

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In all domains and sectors, the demand for intelligent systems to support the processing and generation of digital content is rapidly increasing. The availability of vast amounts of content and the pressure to publish new content quickly and in rapid succession requires faster, more efficient and smarter processing and generation methods. With a consortium of ten partners from research and industry and a broad range of expertise in AI, Machine Learning and Language Technologies, the QURATOR project, funded by the German Federal Ministry of Education and Research, develops a sustainable and innovative technology platform that provides services to support knowledge workers in various industries to address the challenges they face when curating digital content. The projects vision and ambition is to establish an ecosystem for content curation technologies that significantly pushes the current state of the art and transforms its region, the metropolitan area Berlin-Brandenburg, into a global centre of excellence for curation technologies.

rate research

Connecting web-based mapping services with scientific data repositories: collaborative curation and retrieval of simulation data via a geospatial interface

295 - Christian T. Jacobs , Alexandros Avdis 2016

Increasing quantities of scientific data are becoming readily accessible via online repositories such as those provided by Figshare and Zenodo. Geoscientific simulations in particular generate large quantities of data, with several research groups studying many, often overlapping areas of the world. When studying a particular area, being able to keep track of ones own simulations as well as those of collaborators can be challenging. This paper describes the design, implementation, and evaluation of a new tool for visually cataloguing and retrieving data associated with a given geographical location through a web-based Google Maps interface. Each data repository is pin-pointed on the map with a marker based on the geographical location that the dataset corresponds to. By clicking on the markers, users can quickly inspect the metadata of the repositories and download the associated data files. The crux of the approach lies in the ability to easily query and retrieve data from multiple sources via a common interface. While many advances are being made in terms of scientific data repositories, the development of this new tool has uncovered several issues and limitations of the current state-of-the-art which are discussed herein, along with some ideas for the future.

Digital Libraries

Bursting Scientific Filter Bubbles: Boosting Innovation via Novel Author Discovery

132 - Jason Portenoy , Marissa Radensky , Jevin West 2021

Isolated silos of scientific research and the growing challenge of information overload limit awareness across the literature and hinder innovation. Algorithmic curation and recommendation, which often prioritize relevance, can further reinforce these informational filter bubbles. In response, we describe Bridger, a system for facilitating discovery of scholars and their work, to explore design tradeoffs between relevant and novel recommendations. We construct a faceted representation of authors with information gleaned from their papers and inferred author personas, and use it to develop an approach that locates commonalities (bridges) and contrasts between scientists -- retrieving partially similar authors rather than aiming for strict similarity. In studies with computer science researchers, this approach helps users discover authors considered useful for generating novel research directions, outperforming a state-of-art neural model. In addition to recommending new content, we also demonstrate an approach for displaying it in a manner that boosts researchers ability to understand the work of authors with whom they are unfamiliar. Finally, our analysis reveals that Bridger connects authors who have different citation profiles, publish in different venues, and are more distant in social co-authorship networks, raising the prospect of bridging diverse communities and facilitating discovery.

Digital Libraries Computation and Language Human-Computer Interaction

Characterizing References from Different Disciplines: A Perspective of Citation Content Analysis

155 - Chengzhi Zhang , Lifan Liu , Yuzhuo Wang 2021

Multidisciplinary cooperation is now common in research since social issues inevitably involve multiple disciplines. In research articles, reference information, especially citation content, is an important representation of communication among different disciplines. Analyzing the distribution characteristics of references from different disciplines in research articles is basic to detecting the sources of referred information and identifying contributions of different disciplines. This work takes articles in PLoS as the data and characterizes the references from different disciplines based on Citation Content Analysis (CCA). First, we download 210,334 full-text articles from PLoS and collect the information of the in-text citations. Then, we identify the discipline of each reference in these academic articles. To characterize the distribution of these references, we analyze three characteristics, namely, the number of citations, the average cited intensity and the average citation length. Finally, we conclude that the distributions of references from different disciplines are significantly different. Although most references come from Natural Science, Humanities and Social Sciences play important roles in the Introduction and Background sections of the articles. Basic disciplines, such as Mathematics, mainly provide research methods in the articles in PLoS. Citations mentioned in the Results and Discussion sections of articles are mainly in-discipline citations, such as citations from Nursing and Medicine in PLoS.

Digital Libraries Computation and Language

Using Full-text Content of Academic Articles to Build a Methodology Taxonomy of Information Science in China

110 - Heng Zhang , Chengzhi Zhang 2021

Research on the construction of traditional information science methodology taxonomy is mostly conducted manually. From the limited corpus, researchers have attempted to summarize some of the research methodology entities into several abstract levels (generally three levels); however, they have been unable to provide a more granular hierarchy. Moreover, updating the methodology taxonomy is traditionally a slow process. In this study, we collected full-text academic papers related to information science. First, we constructed a basic methodology taxonomy with three levels by manual annotation. Then, the word vectors of the research methodology entities were trained using the full-text data. Accordingly, the research methodology entities were clustered and the basic methodology taxonomy was expanded using the clustering results to obtain a methodology taxonomy with more levels. This study provides new concepts for constructing a methodology taxonomy of information science. The proposed methodology taxonomy is semi-automated; it is more detailed than conventional schemes and the speed of taxonomy renewal has been enhanced.

Digital Libraries Computation and Language

LODE: Linking Digital Humanities Content to the Web of Data

1029 - Jakob Huber , Timo Sztyler , Jan Noessner 2014

Numerous digital humanities projects maintain their data collections in the form of text, images, and metadata. While data may be stored in many formats, from plain text to XML to relational databases, the use of the resource description framework (RDF) as a standardized representation has gained considerable traction during the last five years. Almost every digital humanities meeting has at least one session concerned with the topic of digital humanities, RDF, and linked data. While most existing work in linked data has focused on improving algorithms for entity matching, the aim of the LinkedHumanities project is to build digital humanities tools that work out of the box, enabling their use by humanities scholars, computer scientists, librarians, and information scientists alike. With this paper, we report on the Linked Open Data Enhancer (LODE) framework developed as part of the LinkedHumanities project. With LODE we support non-technical users to enrich a local RDF repository with high-quality data from the Linked Open Data cloud. LODE links and enhances the local RDF repository without compromising the quality of the data. In particular, LODE supports the user in the enhancement and linking process by providing intuitive user-interfaces and by suggesting high-quality linking candidates using tailored matching algorithms. We hope that the LODE framework will be useful to digital humanities scholars complementing other digital humanities tools.

Digital Libraries

comments

Fetching comments

Damascus University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

QURATOR: Innovative Technologies for Content and Data Curation

Ask ChatGPT about the research

No Arabic abstract

Read More