Prediction of gene expression time series and structural analysis of gene regulatory networks using recurrent neural networks

Published by Michele Monti

Publication date: 2021

Research field: Physics, Informatics Engineering

Paper language: English

Authors: Michele Monti, Jonathan Fiorentino, Edoardo Milanetti

Biological Physics; Machine Learning; Data Analysis, Statistics and Probability

Abstract

Methods for time series prediction and classification of gene regulatory networks (GRNs) from gene expression data have been treated separately so far. The recent emergence of attention-based recurrent neural network (RNN) models boosted the interpretability of RNN parameters, making them appealing for the understanding of gene interactions. In this work, we generated synthetic time series gene expression data from a range of archetypal GRNs and we relied on a dual attention RNN to predict the gene temporal dynamics. We show that the prediction is extremely accurate for GRNs with different architectures. Next, we focused on the attention mechanism of the RNN and, using tools from graph theory, we found that its graph properties make it possible to hierarchically distinguish different architectures of the GRN. We show that the GRNs respond differently to the addition of noise in the prediction by the RNN and we relate the noise response to the analysis of the attention mechanism. In conclusion, this work provides a way to understand and exploit the attention mechanism of RNNs and it paves the way for RNN-based methods for time series prediction and inference of GRNs from gene expression data.
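To make the idea of synthetic time series from an archetypal GRN concrete, here is a minimal sketch. The abstract does not say how the data were simulated, so everything in it is an assumption made for illustration: deterministic Hill-type kinetics, forward Euler integration, the function name `simulate_grn`, all parameter values, and the three-gene repressive ring used as the example network.

```python
import numpy as np

def simulate_grn(adj, steps=2000, dt=0.01, k=0.5, hill=2.0, beta=1.0, gamma=1.0, x0=None):
    """Euler-integrate dx_i/dt = beta * prod_j f_ij(x_j) - gamma * x_i.

    adj[i, j] = +1 if gene j activates gene i, -1 if it represses it, 0 if no link.
    All kinetic choices here are illustrative assumptions, not the paper's model.
    """
    adj = np.asarray(adj, dtype=float)
    n_genes = adj.shape[0]
    x = np.full(n_genes, 0.5) if x0 is None else np.asarray(x0, dtype=float).copy()
    traj = np.empty((steps + 1, n_genes))
    traj[0] = x
    for t in range(steps):
        act = x**hill / (k**hill + x**hill)   # Hill activation by each regulator j
        rep = k**hill / (k**hill + x**hill)   # Hill repression by each regulator j
        # factors[i, j]: effect of regulator j on target i; 1.0 where there is no link
        factors = np.where(adj > 0, act, np.where(adj < 0, rep, 1.0))
        production = beta * factors.prod(axis=1)
        x = x + dt * (production - gamma * x)
        traj[t + 1] = x
    return traj

# Hypothetical archetype: a ring in which each gene represses the next one.
ring = [[0, 0, -1],
        [-1, 0, 0],
        [0, -1, 0]]
traj = simulate_grn(ring, x0=[0.9, 0.1, 0.4])
print(traj.shape)  # one row per time point, one column per gene
```

Trajectories like `traj`, generated for several network architectures, are the kind of input a sequence model could be trained on. A stochastic simulator could replace the Euler loop without changing the interface.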

Read also

Stability of gene regulatory networks

85 - Yipei Guo, Ariel Amir 2020

Homeostasis of protein concentrations in cells is crucial for their proper functioning, and this requires concentrations (at their steady-state levels) to be stable to fluctuations. Since gene expression is regulated by proteins such as transcription factors (TFs), the full set of proteins within the cell constitutes a large system of interacting components. Here, we explore factors affecting the stability of this system by coupling the dynamics of mRNAs and protein concentrations in a growing cell. We find that it is possible for protein concentrations to become unstable if the regulation strengths or system size become too large, and that other global structural features of the networks can dramatically enhance the stability of the system. In particular, given the same number of proteins, TFs, number of interactions, and regulation strengths, a network that resembles a bipartite graph with a lower fraction of interactions that target TFs has a higher chance of being stable. By scrambling the $\textit{E. coli}$ transcription network, we find that the randomized network with the same number of regulatory interactions is much more likely to be unstable than the real network. These findings suggest that constraints imposed by system stability could have played a role in shaping the existing regulatory network during the evolutionary process. We also find that, contrary to what one might expect from random matrix theory and what has been argued in the literature, the degradation rate of mRNA does not affect whether the system is stable.

Biological Physics; Disordered Systems and Neural Networks; Molecular Networks
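The stability question raised in the abstract above can be illustrated with the textbook linear criterion: a steady state is stable to small fluctuations when every eigenvalue of the Jacobian has a negative real part. The sketch below applies that criterion to a toy random Jacobian. It is not the authors' coupled mRNA-protein model, and `random_jacobian`, the network size, the connectivity, and the strengths are all hypothetical choices.

```python
import numpy as np

def is_linearly_stable(jacobian):
    """True if every eigenvalue of the Jacobian has a negative real part."""
    return bool(np.max(np.linalg.eigvals(jacobian).real) < 0)

def random_jacobian(n, connectivity, strength, rng):
    """Hypothetical toy Jacobian: unit self-decay plus sparse random couplings."""
    mask = rng.random((n, n)) < connectivity          # which links exist
    couplings = rng.normal(0.0, strength, size=(n, n)) * mask
    np.fill_diagonal(couplings, 0.0)                  # no random self-regulation
    return couplings - np.eye(n)                      # decay term on the diagonal

rng = np.random.default_rng(0)
weak = random_jacobian(200, 0.1, 0.05, rng)    # weak regulation
strong = random_jacobian(200, 0.1, 0.5, rng)   # tenfold stronger regulation
print("weak couplings stable:  ", is_linearly_stable(weak))
print("strong couplings stable:", is_linearly_stable(strong))
```

In this toy setting, raising the coupling strength or the number of nodes pushes eigenvalues across the imaginary axis, which is the random-matrix intuition the abstract contrasts its results with.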

Topological effects of data incompleteness of gene regulatory networks

610 - J. Sanz, E. Cozzo, J. Borge-Holthoefer 2012

The topological analysis of biological networks has been a prolific topic in network science during the last decade. A persistent problem with this approach is the inherent uncertainty and noisy nature of the data. One of the cases in which this situation is more marked is that of transcriptional regulatory networks (TRNs) in bacteria. The datasets are incomplete because regulatory pathways associated with a relevant fraction of bacterial genes remain unknown. Furthermore, direction, strengths and signs of the links are sometimes unknown or simply overlooked. Finally, the experimental approaches to infer the regulations are highly heterogeneous, in a way that induces the appearance of systematic experimental-topological correlations. And yet, the quality of the available data increases constantly. In this work we capitalize on these advances to point out the influence of data (in)completeness and quality on some classical results on topological analysis of TRNs, especially regarding modularity at different levels. In doing so, we identify the most relevant factors affecting the validity of previous findings, highlighting important caveats for future topological analyses of prokaryotic TRNs.

Biological Physics; Physics and Society; Molecular Networks

Chaotic Time Series Prediction using Spatio-Temporal RBF Neural Networks

90 - Alishba Sadiq, Muhammad Sohail Ibrahim, Muhammad Usman 2019

Due to their dynamic nature, chaotic time series are difficult to predict. In conventional signal processing approaches, signals are treated either in the time or in the space domain only. Spatio-temporal analysis of signals provides more advantages over conventional uni-dimensional approaches by harnessing the information from both the temporal and spatial domains. Herein, we propose a spatio-temporal extension of RBF neural networks for the prediction of chaotic time series. The proposed algorithm utilizes the concept of time-space orthogonality and separately deals with the temporal dynamics and spatial non-linearity (complexity) of the chaotic series. The proposed RBF architecture is explored for the prediction of the Mackey-Glass time series and results are compared with the standard RBF. The spatio-temporal RBF is shown to outperform the standard RBFNN by achieving significantly reduced estimation error.

Machine Learning; Data Analysis, Statistics and Probability
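For readers unfamiliar with the benchmark named in the abstract above, the following sketch generates a Mackey-Glass series and fits a plain Gaussian RBF network as a baseline. It does not implement the proposed spatio-temporal extension, which the abstract does not specify. The parameter values (tau = 17, beta = 0.2, gamma = 0.1, exponent 10) are the commonly used ones and are assumed here. The Euler step, the four-lag embedding, the 30 randomly chosen centres, and the width 0.3 are arbitrary illustrative choices.

```python
import numpy as np

def mackey_glass(n_samples, tau=17.0, beta=0.2, gamma=0.1, power=10, dt=0.1, x0=1.2):
    """Euler integration of dx/dt = beta*x(t-tau)/(1+x(t-tau)**power) - gamma*x(t)."""
    delay = int(round(tau / dt))
    stride = int(round(1.0 / dt))            # keep one sample per unit time
    total = n_samples * stride
    x = np.full(total + delay, x0)           # constant history for t <= 0
    for t in range(delay, total + delay - 1):
        x_tau = x[t - delay]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1.0 + x_tau**power) - gamma * x[t])
    return x[delay::stride][:n_samples]

def rbf_features(inputs, centers, width):
    """Gaussian RBF activations exp(-||u - c||^2 / (2 * width^2))."""
    sq_dist = ((inputs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width**2))

series = mackey_glass(1000)

# One-step-ahead task: predict series[k + lags] from the previous `lags` values.
lags = 4
X = np.stack([series[i:len(series) - lags + i] for i in range(lags)], axis=1)
y = series[lags:]

rng = np.random.default_rng(0)
train = 700
centers = X[rng.choice(train, size=30, replace=False)]   # centres drawn from training inputs
Phi = rbf_features(X[:train], centers, width=0.3)
weights, *_ = np.linalg.lstsq(Phi, y[:train], rcond=None)  # linear output layer

pred = rbf_features(X[train:], centers, width=0.3) @ weights
mse = float(np.mean((pred - y[train:]) ** 2))
print("held-out one-step MSE:", mse)
```

The output layer is solved in closed form by least squares. An iteratively trained network would replace that single `lstsq` call with gradient updates.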

Memory functions reveal structural properties of gene regulatory networks

128 - Edgar Herrera-Delgado 2017

Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs.

Molecular Networks
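The role of a memory function is easiest to see in the linear case. The block below is an illustration under a simplifying assumption, not the general nonlinear construction described in the abstract above: the dynamics are taken to be linearised around a steady state, with deviations split into subnetwork variables $x_s$ and bulk variables $x_b$. The block symbols $A_{ss}$, $A_{sb}$, $A_{bs}$, $A_{bb}$ and the kernel name $M$ are notation introduced here.

```latex
\begin{aligned}
\frac{d}{dt}\begin{pmatrix} x_s \\ x_b \end{pmatrix}
  &= \begin{pmatrix} A_{ss} & A_{sb} \\ A_{bs} & A_{bb} \end{pmatrix}
     \begin{pmatrix} x_s \\ x_b \end{pmatrix}, \\
x_b(t) &= e^{A_{bb} t}\, x_b(0)
   + \int_0^t e^{A_{bb}(t - t')} A_{bs}\, x_s(t')\, dt', \\
\frac{d x_s}{dt}
  &= A_{ss}\, x_s(t)
   + \int_0^t M(t - t')\, x_s(t')\, dt'
   + A_{sb}\, e^{A_{bb} t}\, x_b(0), \\
M(\tau) &= A_{sb}\, e^{A_{bb} \tau} A_{bs}.
\end{aligned}
```

Solving the bulk equation and substituting it into the subnetwork equation removes $x_b$ entirely. The kernel $M$ then describes how the subnetwork's past state feeds back on it through the bulk, and the last term carries the bulk's initial condition.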

Predicting Toxicity from Gene Expression with Neural Networks

156 - Peter Eastman, Vijay S. Pande 2019

We train a neural network to predict chemical toxicity based on gene expression data. The input to the network is a full expression profile collected either in vitro from cultured cells or in vivo from live animals. The output is a set of fine-grained predictions for the presence of a variety of pathological effects in treated animals. When trained on the Open TG-GATEs database it produces good results, outperforming classical models trained on the same data. This is a promising approach for efficiently screening chemicals for toxic effects, and for more accurately evaluating drug candidates based on preclinical data.

Genomics