Algebraic Model Selection and Experimental Design in Biological Data Science

Published by: Brandilyn Stigler

Publication date: 2021

Research field: Biology

Language: English

Authors: Anyu Zhang, Jingzhen Hu, Qingzhong Liang

Tags: Algebraic geometry, Quantitative methods

Design of experiments and model selection, though essential steps in data science, are usually viewed as unrelated processes in the study and analysis of biological networks. Not accounting for their inter-relatedness has the potential to introduce bias and increase the risk of missing salient features in the modeling process. We propose a data-driven computational framework to unify experimental design and model selection for discrete data sets and minimal polynomial models. We use a special affine transformation, called a linear shift, to provide both the data sets and the polynomial terms that form a basis for a model. This framework enables us to address two important questions that arise in biological data science research: finding the data which identify a set of known interactions and finding identifiable interactions given a set of data. We present the theoretical foundation for a web-accessible database. As an example, we apply this methodology to a previously constructed pharmacodynamic model of epidermal derived growth factor receptor (EGFR) signaling.
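To make the question of "finding identifiable interactions given a set of data" concrete, the following is a minimal sketch, not the authors' linear-shift construction. It assumes a common notion of identifiability from algebraic design of experiments: a set of monomials is identifiable by a data set over a finite field F_p when the matrix of monomial evaluations at the data points has full column rank. The function names, the prime `p`, and the example points are all hypothetical.

```python
def rank_mod_p(matrix, p):
    """Rank of an integer matrix over the finite field F_p (p prime)."""
    rows = [[v % p for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def evaluate_monomial(exponents, point, p):
    """Evaluate x1^e1 * ... * xn^en at a point, modulo p."""
    value = 1
    for x, e in zip(point, exponents):
        value = (value * pow(x, e, p)) % p
    return value


def is_identifiable(points, monomials, p):
    """Assumed criterion: the evaluation matrix has full column rank."""
    matrix = [[evaluate_monomial(m, pt, p) for m in monomials] for pt in points]
    return rank_mod_p(matrix, p) == len(monomials)


# Hypothetical data: three points in F_3^2; monomials given as exponent tuples.
points = [(0, 0), (1, 0), (0, 1)]
linear_terms = [(0, 0), (1, 0), (0, 1)]      # 1, x1, x2
x1_only_terms = [(0, 0), (1, 0), (2, 0)]     # 1, x1, x1^2
```

Under this assumed criterion, `is_identifiable(points, linear_terms, 3)` holds, while `is_identifiable(points, x1_only_terms, 3)` does not, because two of the points agree in their first coordinate and so cannot separate three functions of `x1` alone.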

445 - Jan Hasenauer, Nick Jagiella, Sabrina Hross 2015

Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the relation between spatial and temporal scales and its implications for multi-scale modelling. Based upon this overview of state-of-the-art modelling approaches, we formulate key challenges in mathematical and computational modelling of biological multi-scale and multi-physics processes. In particular, we consider the availability of analysis tools for multi-scale models and model-based multi-scale data integration. We provide a compact review of methods for model-based data integration and model-based hypothesis testing. Furthermore, novel approaches and recent trends are discussed, including computation time reduction using reduced order and surrogate models, which contribute to the solution of inference problems. We conclude the manuscript by providing a few ideas for the development of tailored multi-scale inference methods.

Tags: Molecular networks, Quantitative methods

Statistical model selection methods applied to biological networks

103 - M.P.H. Stumpf, P.J. Ingram, I. Nouvel 2005

Many biological networks have been labelled scale-free as their degree distribution can be approximately described by a power-law distribution. While the degree distribution does not summarize all aspects of a network, it has often been suggested that its functional form contains important clues as to the underlying evolutionary processes that have shaped the network. Generally, the functional form of the degree distribution has been fitted in an ad hoc fashion. Here we apply formal statistical model selection methods to determine which functional form best describes degree distributions of protein interaction and metabolic networks. We interpret the degree distribution as belonging to a class of probability models and determine which of these models provides the best description for the empirical data using maximum likelihood inference, composite likelihood methods, the Akaike information criterion and goodness-of-fit tests. The whole data set is used in order to determine the parameter that best explains the data under a given model (e.g. scale-free or random graph). As we will show, present protein interaction and metabolic network data from different organisms suggest that simple scale-free models do not provide an adequate description of real network data.
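The combination of maximum likelihood fitting and the Akaike information criterion mentioned above can be illustrated with a small sketch. It is not the authors' procedure: it assumes two continuous candidate models (a shifted exponential and a Pareto power law, both supported on values at or above `x_min`) as stand-ins for the discrete degree models a real analysis would use, and the degree list is made up for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def fit_exponential(xs, x_min=1.0):
    """MLE for density lam * exp(-lam * (x - x_min)) on x >= x_min."""
    n = len(xs)
    total = sum(x - x_min for x in xs)
    lam = n / total
    log_lik = n * math.log(lam) - lam * total
    return lam, log_lik


def fit_power_law(xs, x_min=1.0):
    """MLE for the Pareto density ((alpha - 1) / x_min) * (x / x_min) ** (-alpha)."""
    n = len(xs)
    log_sum = sum(math.log(x / x_min) for x in xs)
    alpha = 1 + n / log_sum
    log_lik = n * math.log((alpha - 1) / x_min) - alpha * log_sum
    return alpha, log_lik


# Hypothetical node degrees, treated as continuous values for simplicity.
degrees = [1, 1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 40]

_, ll_exp = fit_exponential(degrees)
_, ll_pow = fit_power_law(degrees)
scores = {"exponential": aic(ll_exp, 1), "power law": aic(ll_pow, 1)}
best_model = min(scores, key=scores.get)
```

Both candidates here have a single free parameter, so the AIC comparison reduces to comparing maximized log-likelihoods; the penalty term starts to matter once models with different numbers of parameters are compared.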

Tags: Molecular networks, Quantitative biology

Optimal Experimental Design for Mathematical Models of Hematopoiesis

68 - Luis Martinez Lomeli, Abdon Iniguez, Babak Shahbaba 2020

The hematopoietic system has a highly regulated and complex structure in which cells are organized to successfully create and maintain new blood cells. Feedback regulation is crucial to tightly control this system, but the specific mechanisms by which control is exerted are not completely understood. In this work, we aim to uncover the underlying mechanisms in hematopoiesis by conducting perturbation experiments, where animal subjects are exposed to an external agent in order to observe the system response and evolution. Developing a proper experimental design for these studies is an extremely challenging task. To address this issue, we have developed a novel Bayesian framework for optimal design of perturbation experiments. We model the numbers of hematopoietic stem and progenitor cells in mice that are exposed to a low dose of radiation. We use a differential equations model that accounts for feedback and feedforward regulation. A significant obstacle is that the experimental data are not longitudinal; rather, each data point corresponds to a different animal. This model is embedded in a hierarchical framework with latent variables that capture unobserved cellular population levels. We select the optimum design based on the amount of information gain, measured by the Kullback-Leibler divergence between the probability distributions before and after observing the data. We evaluate our approach using synthetic and experimental data. We show that a proper design can lead to better estimates of model parameters even with relatively few subjects. Additionally, we demonstrate that the model parameters show a wide range of sensitivities to design options. Our method should allow scientists to find the optimal design by focusing on their specific parameters of interest and provide insight into hematopoiesis. Our approach can be extended to more complex models where latent components are used.
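The information-gain criterion described above can be sketched on a toy problem. This is an illustrative assumption, not the hierarchical model from the paper: the parameter takes two discrete values, each candidate design is represented only by the likelihood it assigns to one observed outcome, and the gain is the Kullback-Leibler divergence from prior to posterior. A full design procedure would average this gain over the outcomes a design could produce; all names and numbers below are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def posterior(prior, likelihood):
    """Bayes' rule on a discrete parameter grid."""
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


def information_gain(prior, likelihood):
    """Information gained from one observation: KL(posterior || prior)."""
    return kl_divergence(posterior(prior, likelihood), prior)


# Two hypothetical parameter values with a uniform prior.
prior = [0.5, 0.5]

# Hypothetical likelihoods of the observed outcome under each parameter value.
design_a = [0.9, 0.1]   # outcome discriminates strongly between the values
design_b = [0.6, 0.4]   # outcome discriminates weakly

gain_a = information_gain(prior, design_a)
gain_b = information_gain(prior, design_b)
preferred = "A" if gain_a > gain_b else "B"
```

In this toy setting the design whose outcome separates the two parameter values more sharply yields the larger divergence, which is the intuition behind choosing designs by information gain.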

Tags: Methodology, Quantitative methods, Applications of statistics

Autofocused oracles for model-based design

64 - Clara Fannjiang, Jennifer Listgarten 2020

Data-driven design is making headway into a number of application areas, including protein, small-molecule, and materials engineering. The design goal is to construct an object with desired properties, such as a protein that binds to a therapeutic target, or a superconducting material with a higher critical temperature than previously observed. To that end, costly experimental measurements are being replaced with calls to high-capacity regression models trained on labeled data, which can be leveraged in an in silico search for design candidates. However, the design goal necessitates moving into regions of the design space beyond where such models were trained. Therefore, one can ask: should the regression model be altered as the design algorithm explores the design space, in the absence of new data? Herein, we answer this question in the affirmative. In particular, we (i) formalize the data-driven design problem as a non-zero-sum game, (ii) develop a principled strategy for retraining the regression model as the design algorithm proceeds, which we refer to as autofocusing, and (iii) demonstrate the promise of autofocusing empirically.

Tags: Machine learning, Quantitative methods

Modeling biological systems with delays in Bio-PEPA

359 - Giulio Caravagna 2010

Delays in biological systems may be used to model events for which the underlying dynamics cannot be precisely observed, or to provide abstraction of some behavior of the system, resulting in more compact models. In this paper we enrich the stochastic process algebra Bio-PEPA with the possibility of assigning delays to actions, yielding a new non-Markovian process algebra: Bio-PEPAd. This is a conservative extension, meaning that the original syntax of Bio-PEPA is retained and the delay specification, which can now be associated with actions, may be added to existing Bio-PEPA models. The semantics of the firing of the actions with delays is the delay-as-duration approach, earlier presented in papers on the stochastic simulation of biological systems with delays. These semantics of the algebra are given in the Starting-Terminating style, meaning that the state and the completion of an action are observed as two separate events, as required by delays. Furthermore, we outline how to perform stochastic simulation of Bio-PEPAd systems and how to automatically translate a Bio-PEPAd system into a set of Delay Differential Equations, the deterministic framework for modeling of biological systems with delays. We end the paper with two example models of biological systems with delays to illustrate the approach.

Tags: Computational engineering, finance, and science; Quantitative methods
