
A Collaborative Approach to Computational Reproducibility

Added by Remi Rampin
Publication date: 2017
Language: English





Although a standard in the natural sciences, reproducibility has been only episodically applied in experimental computer science. Scientific papers often present a large number of tables, plots, and figures that summarize the obtained results, but then only loosely describe the steps taken to derive them. Not only can the methods and the implementation be complex, but their configuration may also require setting many parameters and/or depend on particular system configurations. While many researchers recognize the importance of reproducibility, the challenge of making it happen often outweighs the benefits. Fortunately, a plethora of reproducibility solutions have recently been designed and implemented by the community. In particular, packaging tools (e.g., ReproZip) and virtualization tools (e.g., Docker) are promising solutions toward facilitating reproducibility for both authors and reviewers. To address the incentive problem, we have implemented a new publication model for the Reproducibility Section of the Information Systems Journal. In this section, authors submit a reproducibility paper that explains in detail the computational assets from a previously published manuscript in Information Systems.
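
As an illustration of how such packaging tools are typically driven, the sketch below traces an execution with ReproZip, packs its dependencies into a self-contained bundle, and replays it inside Docker with reprounzip. It is a minimal, hypothetical example: the traced command python experiment.py, the bundle name experiment.rpz, and the target directory exp-dir are placeholders, and ReproZip and Docker must already be installed.

    import subprocess

    # Trace the original run, recording every file, library, and
    # environment detail the command touches.
    subprocess.run(["reprozip", "trace", "python", "experiment.py"], check=True)

    # Pack the traced execution into a self-contained .rpz bundle.
    subprocess.run(["reprozip", "pack", "experiment.rpz"], check=True)

    # On the reviewer's side: unpack the bundle into a Docker image
    # and replay the original run inside a container.
    subprocess.run(["reprounzip", "docker", "setup", "experiment.rpz", "exp-dir"], check=True)
    subprocess.run(["reprounzip", "docker", "run", "exp-dir"], check=True)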




Related research

Mat Kelly (2020)
This paper presents a use case exploring the application of the Archival Resource Key (ARK) persistent identifier for promoting and maintaining ontologies. In particular, we look at improving computation with an in-house ontology server in the context of temporally aligned vocabularies. This effort demonstrates the utility of ARKs in preparing historical ontologies for computational archival science.
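
As a brief illustration of what makes ARKs suitable for this role (the identifier below is invented, not one from the paper), an ARK is a compact string of the form ark:/<NAAN>/<name> that both encodes its assigning organization and becomes a resolvable URL under the global N2T resolver:

    import re

    # A hypothetical ARK; the NAAN (name assigning authority number)
    # identifies the organization that minted the identifier.
    ark = "ark:/12345/x6ont0lgy1"

    match = re.match(r"ark:/(?P<naan>\d+)/(?P<name>\S+)", ark)
    if match:
        print("NAAN:", match.group("naan"))
        print("name:", match.group("name"))

    # ARKs are actionable: a registered ARK resolves when appended to
    # the N2T resolver's base URL.
    print("resolver URL: https://n2t.net/" + ark)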
Recent reproducibility case studies have raised concerns by showing that much of the deposited research is not reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate reproducibility, because the runtime environment needed to execute the code is absent. New specialized reproducibility tools provide cloud-based computational environments for code encapsulation, thus enabling research portability and reproducibility. However, they often do not enable research discoverability, standardized data citation, or long-term archiving the way data repositories do. This paper addresses the shortcomings of data repositories and reproducibility tools and how they could be overcome to remedy the current lack of computational reproducibility in published and archived research outputs.
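
The missing-runtime problem described above can be made concrete with a small sketch: at deposit time, the interpreter, platform, and installed package versions are snapshotted next to the data, so the repository record carries enough information to rebuild the environment later. This is an illustrative sketch, not a feature of any particular repository; the output file name is a hypothetical choice.

    import json
    import platform
    import sys
    from importlib import metadata

    # Record the interpreter, platform, and installed packages so the
    # runtime environment can be reconstructed alongside the deposit.
    env = {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {
            dist.metadata["Name"]: dist.version
            for dist in metadata.distributions()
        },
    }

    with open("runtime-environment.json", "w") as fh:
        json.dump(env, fh, indent=2, sort_keys=True)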
We present an overview of the recently funded Merging Science and Cyberinfrastructure Pathways: The Whole Tale project (NSF award #1541450). Our approach has two nested goals: 1) deliver an environment that enables researchers to create a complete narrative of the research process, including exposure of the data-to-publication lifecycle, and 2) systematically and persistently link research publications to their associated digital scholarly objects, such as the data, code, and workflows. To enable this, Whole Tale will create an environment where researchers can collaborate on data, workspaces, and workflows and then publish them for future adoption or modification. Published data and applications can be consumed directly by users within the Whole Tale environment or integrated into existing or future domain Science Gateways.
V. K. Ivanov (2021)
The article considers a quantitative approach to assessing the innovativeness of different objects. The proposed assessment model is based on retrieving object data from various databases, including the Internet. We present an object linguistic model, a processing technique for the measurement results, including results retrieved from different search engines, and a technique for evaluating source credibility. Empirical research into the adequacy of the computational model includes the acquisition and preprocessing of patent data from different databases and the computation of invention innovativeness values, namely their novelty and relevance. The experimental results, namely the comparative assessments of innovativeness values and major trends, show that the developed models are sufficiently adequate and can be used in further research.
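
The abstract does not reproduce the scoring formulas, but the flavor of a novelty computation can be suggested with a hedged sketch: one common approach scores a candidate invention by its TF-IDF dissimilarity to prior patent texts. Everything below (the corpus, the candidate text, and the choice of cosine similarity) is an illustrative assumption, not the authors' model.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical prior-art corpus and candidate invention description.
    prior_art = [
        "method for storing energy in a lithium battery cell",
        "apparatus for wireless charging of portable devices",
    ]
    candidate = "system for wireless energy transfer to battery cells"

    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(prior_art + [candidate])

    # Score novelty as one minus the highest similarity to any prior
    # document: a candidate close to existing patents scores low.
    similarities = cosine_similarity(matrix[-1], matrix[:-1])
    novelty = 1.0 - similarities.max()
    print(f"novelty score: {novelty:.2f}")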
Analysis pipelines commonly use high-level technologies that are popular when created but are unlikely to be readable, executable, or sustainable in the long term. A set of criteria is introduced to address this problem: completeness (no execution requirement beyond a minimal Unix-like operating system, no administrator privileges, no network connection, and storage primarily in plain text); modular design; minimal complexity; scalability; verifiable inputs and outputs; version control; linking analysis with narrative; and free software. As a proof of concept, we introduce Maneage (Managing data lineage), which enables cheap archiving, provenance extraction, and peer verification, and has been tested in several research publications. We show that longevity is a realistic requirement that does not sacrifice immediate or short-term reproducibility. The caveats (with proposed solutions) are then discussed, and we conclude with the benefits for the various stakeholders. This paper is itself written with Maneage (project commit eeff5de).
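
One of the criteria above, verifiable inputs and outputs, is commonly met by recording cryptographic checksums of every file an analysis consumes and produces. The sketch below illustrates the idea only; the manifest format and file names are hypothetical and this is not Maneage's actual mechanism.

    import hashlib

    def sha256sum(path: str) -> str:
        """Stream a file through SHA-256 and return its hex digest."""
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Verify each file against the digest recorded in a manifest of
    # "<hex digest> <path>" lines (a hypothetical format).
    with open("manifest.txt") as fh:
        for line in fh:
            expected, path = line.split()
            status = "OK" if sha256sum(path) == expected else "MISMATCH"
            print(status, path)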