
Case Studies and Challenges in Reproducibility in the Computational Sciences

Posted by: Alexander Konovalov
Publication date: 2014
Research field: Informatics Engineering
Paper language: English





This paper investigates the reproducibility of computational science research and identifies key challenges facing the community today. It is the result of the First Summer School on Experimental Methodology in Computational Science Research (https://blogs.cs.st-andrews.ac.uk/emcsr2014/). First, we consider how to reproduce experiments that involve human subjects, and in particular how to deal with differing ethics requirements at different institutions. Second, we look at whether parallel and distributed computational experiments are more or less reproducible than serial ones. Third, we consider reproducible computational experiments from fields outside computer science. Our final case study asks whether reproducibility for one researcher is the same as for another, by having an author attempt to have others reproduce their own, reproducible, paper. This paper is open, executable and reproducible: the whole process of writing it is captured in the source control repository hosting the source of the paper together with supplementary code and data; we provide the setup for several of the experiments on which we worked; and we describe what we achieved during the week of the school in a way that allows others to reproduce (and, we hope, improve) our experiments.
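As a sketch of what an "executable paper" build can look like, the following Python script re-runs the experiment scripts and then rebuilds the PDF from source. The repository layout (experiments/*.py, paper.tex) is a hypothetical placeholder, not the actual structure of the EMCSR repository:

    # build.py -- minimal sketch of an executable-paper build step.
    import subprocess
    from pathlib import Path

    def run(cmd):
        # Echo each command so the build log doubles as a record of the build.
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def main():
        # Re-run every experiment script so figures and tables are regenerated
        # from raw data rather than pasted in by hand.
        for script in sorted(Path("experiments").glob("*.py")):
            run(["python", str(script)])
        # Rebuild the paper from source; latexmk reruns LaTeX until
        # cross-references stabilise.
        run(["latexmk", "-pdf", "paper.tex"])

    if __name__ == "__main__":
        main()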




Read also

The ability to test the nature of dark mass-energy components in the universe through large-scale structure studies hinges on accurate predictions of sky survey expectations within a given world model. Numerical simulations predict key survey signatures with varying degrees of confidence, limited mainly by the complex astrophysics of galaxy formation. As surveys grow in size and scale, systematic uncertainties in theoretical modeling can become dominant. Dark energy studies will challenge the computational cosmology community to critically assess current techniques, develop new approaches to maximize accuracy, and establish new tools and practices to efficiently employ globally networked computing resources.
This article sets out our perspective on how to begin the journey of decolonising computational fields, such as the data and cognitive sciences. We see this struggle as requiring two basic steps: a) realisation that the present-day system has inherited, and still enacts, hostile, conservative, and oppressive behaviours and principles towards women of colour (WoC); and b) rejection of the idea that centering individual people is a solution to system-level problems. The longer we ignore these two steps, the more our academic system maintains its toxic structure, excludes, and harms Black women and other minoritised groups. This also keeps the door open to discredited pseudoscience, like eugenics and physiognomy. We propose that grappling with our fields' histories and heritage holds the key to avoiding the mistakes of the past. For example, initiatives such as diversity boards can still be harmful because they superficially appear reformatory but nonetheless center whiteness and maintain the status quo. Building on the work of the many WoC who have been paving the way, we hope to advance the dialogue required to build both a grass-roots and a top-down re-imagining of the computational sciences -- including but not limited to psychology, neuroscience, cognitive science, computer science, data science, statistics, machine learning, and artificial intelligence. We aspire for these fields to progress away from their stagnant, sexist, and racist shared past into carving and maintaining an ecosystem where both a diverse demographic of researchers and scientific ideas that critically challenge the status quo are welcomed.
The scale and scope of scholarly articles today are overwhelming human researchers who seek to digest and synthesize knowledge in a timely manner. In this paper, we develop natural language processing (NLP) models to accelerate the extraction of relationships from scholarly papers in the social sciences, identify hypotheses in these papers, and extract the cause-and-effect entities. Specifically, we develop models to 1) classify sentences in scholarly documents in business and management as hypotheses (hypothesis classification), 2) classify these hypotheses as causal relationships or not (causality classification), and, if they are causal, 3) extract the cause and effect entities from these hypotheses (entity extraction). We achieved high performance on all three tasks using different modeling techniques. Our approach may generalize to scholarly documents across a wide range of social sciences, as well as to other types of textual material.
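A minimal sketch of the three-stage pipeline described above, assuming TF-IDF plus logistic-regression baselines and a causal cue-word heuristic; these are illustrative stand-ins, not the authors' models:

    import re
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def make_classifier():
        # TF-IDF + logistic regression: a standard text-classification baseline.
        return make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                             LogisticRegression(max_iter=1000))

    # Stage 1: is a sentence a hypothesis? Stage 2: is a hypothesis causal?
    hypothesis_clf = make_classifier()
    causality_clf = make_classifier()
    # hypothesis_clf.fit(sentences, is_hypothesis_labels)
    # causality_clf.fit(hypotheses, is_causal_labels)

    CUE = re.compile(r"\b(leads to|causes|increases|decreases|results in)\b", re.I)

    def extract_cause_effect(hypothesis: str):
        # Stage 3 placeholder: split on a causal cue word, treating the
        # left span as the cause and the right span as the effect.
        m = CUE.search(hypothesis)
        if not m:
            return None
        return hypothesis[: m.start()].strip(), hypothesis[m.end():].strip()

    print(extract_cause_effect("Managerial autonomy increases firm innovation."))
    # -> ('Managerial autonomy', 'firm innovation.')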
A comparative study is done of interdisciplinary citations in 2013 between physics, chemistry, and molecular biology, in Brazil, South Korea, Turkey, and the USA. Several surprising conclusions emerge from our tabular and graphical analysis. The cross-science citation rates are in general strikingly similar across Brazil, South Korea, Turkey, and the USA. One apparent exception is the comparatively more tenuous relation between molecular biology and physics in Brazil and the USA. Other slight exceptions are the higher amount of citing of physicists by chemists in South Korea, of chemists by molecular biologists in Turkey, and of molecular biologists by chemists in Brazil and the USA. Chemists are, by a sizable margin, the most cross-science citing scientists in this group of three sciences. Physicists are, again by a sizable margin, the least cross-science citing scientists in this group. In all four countries, the strongest cross-science citation is from chemistry to physics and the weakest is from physics to molecular biology. Our findings are consistent with a V-shaped backbone connectivity, as opposed to a Delta connectivity, as also found in a previous study of earlier citation years.
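To make the notion of a cross-science citation rate concrete, the sketch below normalizes a field-to-field citation count matrix row-wise; the counts are made-up placeholders, not the study's data:

    import numpy as np

    fields = ["physics", "chemistry", "molecular biology"]
    # citations[i, j] = number of references from field i to field j
    citations = np.array([
        [900.0, 60.0, 5.0],    # physics cites...
        [120.0, 800.0, 40.0],  # chemistry cites...
        [10.0, 70.0, 850.0],   # molecular biology cites...
    ])

    # Cross-science citation rate: the share of a field's outgoing
    # references that point to each other field.
    rates = citations / citations.sum(axis=1, keepdims=True)
    for i, src in enumerate(fields):
        for j, dst in enumerate(fields):
            if i != j:
                print(f"{src} -> {dst}: {rates[i, j]:.1%}")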
Reproducibility in the computational sciences has been stymied by the complex and rapidly changing computational environments in which modern research takes place. While many espouse reproducibility as a value, the challenge of making it happen (both for one's own work and when testing the reproducibility of others' work) often outweighs the benefits. A few reproducibility solutions have been designed and implemented by the community. In particular, the authors are contributors to ReproZip, a tool that enables computational reproducibility by tracing and bundling together research in the environment in which it takes place (e.g. one's computer or server). In this white paper, we introduce ReproServer, a tool for unpacking ReproZip bundles in the cloud. ReproServer takes an uploaded ReproZip bundle (.rpz file) or a link to a ReproZip bundle, and users can then unpack it in the cloud via their browser, allowing them to reproduce colleagues' work without having to install anything locally. This will lower the barrier to reproducing others' work, which will aid reviewers in verifying the claims made in papers and in reusing previously published research.
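For context, here is a sketch of the local ReproZip workflow that ReproServer moves into the browser. The command names follow the ReproZip/reprounzip CLI, while the experiment command and file names are placeholders:

    import subprocess

    def run(cmd):
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # On the author's machine: trace the experiment's system calls, then
    # bundle the observed files and environment into an .rpz archive.
    run(["reprozip", "trace", "python", "experiment.py"])
    run(["reprozip", "pack", "experiment.rpz"])

    # On a colleague's machine: unpack and re-run inside Docker. ReproServer
    # replaces these two steps with an upload (or a link) in the browser.
    run(["reprounzip", "docker", "setup", "experiment.rpz", "experiment_dir"])
    run(["reprounzip", "docker", "run", "experiment_dir"])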