Sharing and Preserving Computational Analyses for Posterity with encapsulator

Added by Thomas Pasquier

Publication date 2018

fields Informatics Engineering

Research language English

Authors Thomas Pasquier - Matthew K. Lau - Xueyuan Han

Digital Libraries

Open data and open-source software may be part of the solution to science's reproducibility crisis, but they are insufficient to guarantee reproducibility. Requiring minimal end-user expertise, encapsulator creates a time capsule with reproducible code in a self-contained computational environment. encapsulator provides end-users with a fully-featured desktop environment for reproducible research.

Developing a Robust Migration Workflow for Preserving and Curating Hand-held Media

571 - Angela Dappert , Andrew N. Jackson , Akiko Kimura 2013

Many memory institutions hold large collections of hand-held media, which can comprise hundreds of terabytes of data spread over many thousands of data-carriers. Many of these carriers are at risk of significant physical degradation over time, depending on their composition. Unfortunately, handling them manually is enormously time-consuming, so a full and frequent evaluation of their condition is extremely expensive. It is, therefore, important to develop scalable processes for stabilizing them onto backed-up online storage where they can be subject to high-quality digital preservation management. This goes hand in hand with the need to establish efficient, standardized ways of recording metadata and to deal with defective data-carriers. This paper discusses processing approaches, workflows, technical set-up, and software solutions, and touches on staffing needs for the stabilization process. We have experimented with different disk copying robots, defined our metadata, and addressed storage issues to scale stabilization to the vast quantities of digital objects on hand-held data-carriers that need to be preserved. Working closely with the content curators, we have been able to build a robust data migration workflow and have stabilized over 16 terabytes of data in a scalable and economical manner.

Digital Libraries

Computational Analyses of Arabic Morphology

52 - George A. Kiraz 1994

This paper demonstrates how a (multi-tape) two-level formalism can be used to write two-level grammars for Arabic non-linear morphology using a high-level, but computationally tractable, notation. Three illustrative grammars are provided based on CV-, moraic- and affixational analyses. These are complemented by a proposal for handling the hitherto computationally untreated problem of the broken plural. It will be shown that the best grammars for describing Arabic non-linear morphology are moraic in the case of templatic stems, and affixational in the case of a-templatic stems. The paper will demonstrate how the broken plural can be derived under two-level theory via the 'implicit' derivation of the singular.

Computation and Language

Who pays? Comparing cost sharing models for a Gold Open Access publication environment

71 - Andre Bruns , Christine Rimmert , Niels Taubert 2020

The article focuses on possible financial effects of the transformation towards Gold Open Access publishing based on article processing charges and studies an aspect that has so far been overlooked: Do possible cost sharing models lead to the same overall expenses or do they result in different financial burdens for the research institutions involved? It takes the current state of Gold OA publishing as a starting point and develops five possible models of attributing costs based on different author roles, number of authors and author-address-combinations. The analysis of the distributional effects of applying the different models shows that all models result in similar expenditures for the overwhelming majority of institutions. Still, for some research institutions the difference between the most and least expensive models amounts to a considerable sum of money. Given that the model calculation only considers publications that are Open Access and where all authors come from Germany, it is likely that different cost sharing models will become an issue in the debate on how to shoulder a possible large-scale transformation towards Open Access based on publication fees.

Digital Libraries

A Collaborative Approach to Computational Reproducibility

93 - Fernando Chirigati , Rebecca Capone , Dennis Shasha 2017

Although a standard in natural science, reproducibility has been only episodically applied in experimental computer science. Scientific papers often present a large number of tables, plots and pictures that summarize the obtained results, but then loosely describe the steps taken to derive them. Not only can the methods and the implementation be complex, but also their configuration may require setting many parameters and/or depend on particular system configurations. While many researchers recognize the importance of reproducibility, the challenge of making it happen often outweighs the benefits. Fortunately, a plethora of reproducibility solutions have been recently designed and implemented by the community. In particular, packaging tools (e.g., ReproZip) and virtualization tools (e.g., Docker) are promising solutions towards facilitating reproducibility for both authors and reviewers. To address the incentive problem, we have implemented a new publication model for the Reproducibility Section of Information Systems Journal. In this section, authors submit a reproducibility paper that explains in detail the computational assets from a previously published manuscript in Information Systems.

Digital Libraries

A Computational Approach to Historical Ontologies

361 - Mat Kelly 2020

This paper presents a use case exploring the application of the Archival Resource Key (ARK) persistent identifier for promoting and maintaining ontologies. In particular, we look at improving computation with an in-house ontology server in the context of temporally aligned vocabularies. This effort demonstrates the utility of ARKs in preparing historical ontologies for computational archival science.

Digital Libraries Information Retrieval