ترغب بنشر مسار تعليمي؟ اضغط هنا

A Collaborative Approach to Computational Reproducibility

94   0   0.0 ( 0 )
 نشر من قبل Remi Rampin
 تاريخ النشر 2017
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Although a standard in natural science, reproducibility has been only episodically applied in experimental computer science. Scientific papers often present a large number of tables, plots and pictures that summarize the obtained results, but then loosely describe the steps taken to derive them. Not only can the methods and the implementation be complex, but also their configuration may require setting many parameters and/or depend on particular system configurations. While many researchers recognize the importance of reproducibility, the challenge of making it happen often outweigh the benefits. Fortunately, a plethora of reproducibility solutions have been recently designed and implemented by the community. In particular, packaging tools (e.g., ReproZip) and virtualization tools (e.g., Docker) are promising solutions towards facilitating reproducibility for both authors and reviewers. To address the incentive problem, we have implemented a new publication model for the Reproducibility Section of Information Systems Journal. In this section, authors submit a reproducibility paper that explains in detail the computational assets from a previous published manuscript in Information Systems.



قيم البحث

اقرأ أيضاً

361 - Mat Kelly 2020
This paper presents a use case exploring the application of the Archival Resource Key (ARK) persistent identifier for promoting and maintaining ontologies. In particular, we look at improving computation with an in-house ontology server in the contex t of temporally aligned vocabularies. This effort demonstrates the utility of ARKs in preparing historical ontologies for computational archival science.
Recent reproducibility case studies have raised concerns showing that much of the deposited research has not been reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate reproduci bility due to the absence of a runtime environment needed for the code execution. New specialized reproducibility tools provide cloud-based computational environments for code encapsulation, thus enabling research portability and reproducibility. However, they do not often enable research discoverability, standardized data citation, or long-term archival like data repositories do. This paper addresses the shortcomings of data repositories and reproducibility tools and how they could be overcome to improve the current lack of computational reproducibility in published and archived research outputs.
We present an overview of the recently funded Merging Science and Cyberinfrastructure Pathways: The Whole Tale project (NSF award #1541450). Our approach has two nested goals: 1) deliver an environment that enables researchers to create a complete na rrative of the research process including exposure of the data-to-publication lifecycle, and 2) systematically and persistently link research publications to their associated digital scholarly objects such as the data, code, and workflows. To enable this, Whole Tale will create an environment where researchers can collaborate on data, workspaces, and workflows and then publish them for future adoption or modification. Published data and applications will be consumed either directly by users using the Whole Tale environment or can be integrated into existing or future domain Science Gateways.
109 - V. K. Ivanov 2021
The article considers the quantitative assessment approach to the innovativeness of different objects. The proposed assessment model is based on the object data retrieval from various databases including the Internet. We present an object linguistic model, the processing technique for the measurement results including the results retrieved from the different search engines, and the evaluating technique of the source credibility. Empirical research of the computational model adequacy includes the acquisition and preprocessing of patent data from different databases and the computation of invention innovativeness values: their novelty and relevance. The experiment results, namely the comparative assessments of innovativeness values and major trends, show the models developed are sufficiently adequate and can be used in further research.
Analysis pipelines commonly use high-level technologies that are popular when created, but are unlikely to be readable, executable, or sustainable in the long term. A set of criteria is introduced to address this problem: Completeness (no execution r equirement beyond a minimal Unix-like operating system, no administrator privileges, no network connection, and storage primarily in plain text); modular design; minimal complexity; scalability; verifiable inputs and outputs; version control; linking analysis with narrative; and free software. As a proof of concept, we introduce Maneage (Managing data lineage), enabling cheap archiving, provenance extraction, and peer verification that been tested in several research publications. We show that longevity is a realistic requirement that does not sacrifice immediate or short-term reproducibility. The caveats (with proposed solutions) are then discussed and we conclude with the benefits for the various stakeholders. This paper is itself written with Maneage (project commit eeff5de).
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا