Structuring research methods and data with the Research Object model: genomics workflows as a case study

Published by: Stian Soiland-Reyes

Publication date: 2013

Research field: Biology, Informatics Engineering

Language: English

Authors: Kristina M. Hettne, Harish Dharuri, Jun Zhao

Genomics, Digital Libraries

Abstract

One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide necessary meta-data for a scientist to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows.
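As a rough illustration of the aggregation idea in the abstract above, the following Python sketch models an RO as a container of typed resources with free-text annotations. The class, field, and method names (`Resource`, `ResearchObject`, `aggregate`, `annotate`, `by_kind`) and the example URIs are hypothetical and are not taken from the RO model's actual vocabulary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One aggregated item, e.g. a dataset, a workflow, or a text file."""
    uri: str
    kind: str  # hypothetical free-text type label, e.g. "dataset"


@dataclass
class ResearchObject:
    """Hypothetical stand-in for an RO: a resource that aggregates others."""
    uri: str
    resources: list[Resource] = field(default_factory=list)
    # Maps the URI of an annotated item to the notes attached to it.
    annotations: dict[str, list[str]] = field(default_factory=dict)

    def aggregate(self, resource: Resource) -> None:
        """Add a resource to this RO."""
        self.resources.append(resource)

    def annotate(self, target_uri: str, note: str) -> None:
        """Attach a note to the RO itself or to one of its resources."""
        known = {self.uri} | {r.uri for r in self.resources}
        if target_uri not in known:
            raise ValueError(f"{target_uri} is not part of this RO")
        self.annotations.setdefault(target_uri, []).append(note)

    def by_kind(self, kind: str) -> list[Resource]:
        """Return all aggregated resources with the given type label."""
        return [r for r in self.resources if r.kind == kind]


# Example with made-up URIs: one workflow and one input dataset.
ro = ResearchObject(uri="ro/example")
ro.aggregate(Resource("ro/example/workflow", "workflow"))
ro.aggregate(Resource("ro/example/input.csv", "dataset"))
ro.annotate("ro/example/input.csv", "Input data for the workflow")
```

Keeping annotations keyed by URI, rather than stored on each resource, lets the same structure describe both the RO and the items it aggregates.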

Timm Fitschen, Alexander Schlemmer, Daniel Hornung (2018)

Here we present CaosDB, a Research Data Management System (RDMS) designed to ensure seamless integration of inhomogeneous data sources and repositories of legacy data. Its primary purpose is the management of data from biomedical sciences, both from simulations and experiments, during the complete research data lifecycle. An RDMS for this domain faces particular challenges: research data arise in huge amounts, from a wide variety of sources, and traverse a highly branched path of further processing. To be accepted by its users, an RDMS must be built around the scientists' workflows and practices and thus support changes in workflow and data structure. Nevertheless, it should encourage and support the development and observation of standards and, furthermore, facilitate the automation of data acquisition and processing with specialized software. The storage data model of an RDMS must reflect these complexities with appropriate semantics and ontologies while offering simple methods for finding, retrieving, and understanding relevant data. We show how CaosDB responds to these challenges and give an overview of the CaosDB Server, its data model, and its easy-to-learn CaosDB Query Language. We briefly discuss the status of the implementation, how we currently use CaosDB, and how we plan to use and extend it.

Databases, Artificial Intelligence

A large-scale study on research code quality and execution

Ana Trisovic, Matthew K. Lau, Thomas Pasquier (2021)

This article presents a study on the quality and execution of research code from publicly available replication datasets at the Harvard Dataverse repository. Research code is typically created by a group of scientists and published together with academic papers to facilitate research transparency and reproducibility. For this study, we define ten questions to address aspects impacting research reproducibility and reuse. First, we retrieve and analyze more than 2000 replication datasets with over 9000 unique R files published from 2010 to 2020. Second, we execute the code in a clean runtime environment to assess its ease of reuse. Common coding errors were identified, and some of them were solved with automatic code cleaning to aid code execution. We find that 74% of R files crashed in the initial execution, while 56% crashed when code cleaning was applied, showing that many errors can be prevented with good coding practices. We also analyze the replication datasets from journal collections and discuss the impact of journal policy strictness on the code re-execution rate. Finally, based on our results, we propose a set of recommendations for code dissemination aimed at researchers, journals, and repositories.

Software Engineering, Digital Libraries

Using Supervised Learning to Classify Metadata of Research Data by Discipline of Research

Tobias Weber, Dieter Kranzlmuller, Michael Fromm (2019)

Automated classification of metadata of research data by their discipline(s) of research can be used in scientometric research, by repository service providers, and in the context of research data aggregation services. Openly available metadata of the DataCite index for research data were used to compile a large training and evaluation set comprising 609,524 records, which is published alongside this paper. These data allow classification approaches, such as tree-based models and neural networks, to be assessed reproducibly. According to our experiments with 20 base classes (multi-label classification), multi-layer perceptron models perform best with an f1-macro score of 0.760, closely followed by Long Short-Term Memory models (f1-macro score of 0.755). A possible application of the trained classification models is the quantitative analysis of trends towards interdisciplinarity of digital scholarly output or the characterization of growth patterns of research data, stratified by discipline of research. Both applications perform at scale with the proposed models, which are available for re-use.

Information Retrieval, Digital Libraries, Machine Learning

Operational Research Literature as a Use Case for the Open Research Knowledge Graph

Mila Runnwerth, Markus Stocker, Soren Auer (2020)

The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that habitually is written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the moment, papers in the ORKG are described manually, but in the long run capturing the semantic depth of the literature at scale needs automation. Operational Research is a suitable test case for this vision because the mathematical field and, hence, its publication habits are highly structured: a mundane problem is formulated as a mathematical model, solved or approximated numerically, and evaluated systematically. We study the existing literature with respect to the Assembly Line Balancing Problem and derive a semantic description in accordance with the ORKG. Eventually, selected papers are ingested to test the semantic description and refine it further.

Digital Libraries

Delivering Scientific Influence Analysis as a Service on Research Grants Repository

Yuming Wang, Yanbo Long, Lai Tu (2019)

Research grants have played an important role in seeding and promoting fundamental research projects worldwide. There is a growing demand for developing and delivering scientific influence analysis as a service on research grant repositories. Such analysis can provide insight on how research grants help foster new research collaborations, encourage cross-organizational collaborations, influence new research trends, and identify technical leadership. This paper presents the design and development of a grants-based scientific influence analysis service, coined as GImpact. It takes a graph-theoretic approach to design and develop large scale scientific influence analysis over a large research-grant repository with three original contributions. First, we mine the grant database to identify and extract important features for grants influence analysis and represent such features using graph theoretic models. For example, we extract an institution graph and multiple associated aspect-based collaboration graphs, including a discipline graph and a keyword graph. Second, we introduce self-influence and co-influence algorithms to compute two types of collaboration relationship scores based on the number of grants and the types of grants for institutions. We compute the self-influence scores to reflect the grant based research collaborations among institutions and compute multiple co-influence scores to model the various types of cross-institution collaboration relationships in terms of disciplines and subject areas. Third, we compute the overall scientific influence score for every pair of institutions by introducing a weighted sum of the self-influence score and the multiple co-influence scores and conduct an influence-based clustering analysis. We evaluate GImpact using a real grant database, consisting of 2512 institutions and their grants received over a period of 14 years...
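The weighted-sum step described in the abstract above can be sketched in Python as follows. This is only an illustration under assumed inputs: the function name `overall_influence`, the aspect names, the weights, and the example scores are all hypothetical, and the abstract does not say how GImpact computes the individual scores or chooses its weights.

```python
def overall_influence(self_score, co_scores, self_weight, co_weights):
    """Weighted sum of one self-influence score and several co-influence scores.

    co_scores and co_weights are dicts keyed by aspect name
    (hypothetical aspects here: "discipline" and "keyword").
    """
    if set(co_scores) != set(co_weights):
        raise ValueError("co_scores and co_weights must cover the same aspects")
    weighted_co = sum(co_weights[aspect] * co_scores[aspect] for aspect in co_scores)
    return self_weight * self_score + weighted_co


# Made-up scores for one pair of institutions.
score = overall_influence(
    self_score=0.5,
    co_scores={"discipline": 0.25, "keyword": 0.75},
    self_weight=0.5,
    co_weights={"discipline": 0.25, "keyword": 0.25},
)
```

Passing the weights explicitly keeps the sketch neutral about how they would be set in practice.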

Social and Information Networks, Digital Libraries, Physics and Society
