The topological analysis of biological networks has been a prolific topic in network science during the last decade. A persistent problem with this approach is the inherent uncertainty and noisy nature of the data. One of the cases in which this situation is most marked is that of transcriptional regulatory networks (TRNs) in bacteria. The datasets are incomplete because the regulatory pathways associated with a substantial fraction of bacterial genes remain unknown. Furthermore, the directions, strengths, and signs of the links are sometimes unknown or simply overlooked. Finally, the experimental approaches used to infer the regulations are highly heterogeneous, in a way that induces systematic experimental-topological correlations. And yet, the quality of the available data increases constantly. In this work we capitalize on these advances to point out the influence of data (in)completeness and quality on some classical results in the topological analysis of TRNs, especially regarding modularity at different levels. In doing so, we identify the most relevant factors affecting the validity of previous findings, highlighting important caveats for future topological analyses of prokaryotic TRNs.
Homeostasis of protein concentrations in cells is crucial for their proper functioning, and this requires concentrations (at their steady-state levels) to be stable to fluctuations. Since gene expression is regulated by proteins such as transcription factors (TFs), the full set of proteins within the cell constitutes a large system of interacting components. Here, we explore factors affecting the stability of this system by coupling the dynamics of mRNA and protein concentrations in a growing cell. We find that protein concentrations can become unstable if the regulation strengths or the system size become too large, and that other global structural features of the network can dramatically enhance the stability of the system. In particular, given the same number of proteins, TFs, and interactions, and the same regulation strengths, a network that resembles a bipartite graph, with a lower fraction of interactions that target TFs, has a higher chance of being stable. By scrambling the E. coli transcription network, we find that a randomized network with the same number of regulatory interactions is much more likely to be unstable than the real network. These findings suggest that constraints imposed by system stability could have played a role in shaping the existing regulatory network during the evolutionary process. We also find that, contrary to what one might expect from random matrix theory and what has been argued in the literature, the degradation rate of mRNA does not affect whether the system is stable.
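The structural quantity singled out above, the fraction of interactions that target TFs, can be sketched on a toy network. The following is a minimal illustration only: the six-gene network, the helper names `fraction_targeting_tfs` and `scramble_targets`, and the scrambling rule (keep each source TF, redraw its target uniformly at random) are all assumptions made for this example and are not taken from the work summarized above.

```python
import random


def fraction_targeting_tfs(edges):
    """Fraction of regulatory interactions whose target is itself a TF.

    Assumption: a gene counts as a TF if it is the source of at least one edge.
    """
    tfs = {src for src, _ in edges}
    return sum(1 for _, tgt in edges if tgt in tfs) / len(edges)


def scramble_targets(edges, genes, rng):
    """Hypothetical null model: keep each edge's source TF and redraw its
    target uniformly from all genes. The number of interactions is preserved;
    duplicate edges and self-loops are not filtered out in this sketch."""
    return [(src, rng.choice(genes)) for src, _ in edges]


# Toy, nearly bipartite network: TFs "A" and "B" mostly regulate non-TF genes.
genes = ["A", "B", "c", "d", "e", "f"]
edges = [("A", "c"), ("A", "d"), ("B", "e"), ("B", "f"), ("A", "B")]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
scrambled = scramble_targets(edges, genes, rng)

print("original :", fraction_targeting_tfs(edges))
print("scrambled:", fraction_targeting_tfs(scrambled))
```

Under this assumed null model each redrawn target is a TF with probability 2/6, so scrambling tends to raise the fraction above the 1/5 of the toy network. Whether that reduces stability depends on the dynamical model, which this sketch does not implement.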
We analyze zebrafish gene expression data under the combined framework of complex networks and random matrix theory. The nearest neighbor spacing distribution of the corresponding matrix spectra follows the random matrix predictions of Gaussian orthogonal statistics. Based on the eigenvector analysis, we can divide the spectra into two parts: a first part for which the eigenvector localization properties match the random matrix theory predictions, and a second part for which they deviate from the theory and are hence useful for understanding system-dependent properties. Spectra with localized eigenvectors can be classified into three groups based on the eigenvalues. We explore the position of the localized nodes from these different categories. Using an overlap measure, we find that the top contributing nodes in the different groups carry distinct structural features. Furthermore, the top contributing nodes of the different localized eigenvectors in the lower eigenvalue regime form different densely connected structures that are well separated from each other. A preliminary biological interpretation of the genes associated with the top contributing nodes in the localized eigenvectors suggests that the genes corresponding to the same vector share common features.
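Two standard diagnostics behind this kind of analysis can be sketched in a few lines: the nearest neighbor spacings compared against the Wigner surmise for Gaussian orthogonal statistics, and the inverse participation ratio (IPR) as a measure of eigenvector localization. This is a minimal sketch under stated assumptions: the function names are illustrative, the spacings are rescaled only by their global mean (a crude stand-in for a proper unfolding of the spectrum), and the abstract does not state which localization measure the authors used.

```python
import math


def normalized_spacings(eigenvalues):
    """Nearest neighbor spacings divided by their mean.

    Assumption: dividing by the global mean spacing is a crude substitute
    for unfolding the spectrum with a smooth density estimate.
    """
    ev = sorted(eigenvalues)
    gaps = [b - a for a, b in zip(ev, ev[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return [g / mean_gap for g in gaps]


def wigner_goe(s):
    """Wigner surmise for the GOE: P(s) = (pi/2) * s * exp(-pi * s^2 / 4)."""
    return (math.pi / 2) * s * math.exp(-math.pi * s * s / 4)


def ipr(vector):
    """Inverse participation ratio: sum(v_i^4) / (sum(v_i^2))^2.

    Equals 1/N for a vector spread evenly over N nodes and 1 for a vector
    concentrated on a single node.
    """
    norm2 = sum(v * v for v in vector)
    return sum(v ** 4 for v in vector) / norm2 ** 2


print(normalized_spacings([0.0, 1.0, 3.0, 6.0]))
print(ipr([0.5, 0.5, 0.5, 0.5]), ipr([1.0, 0.0, 0.0, 0.0]))
```

In practice one would histogram the normalized spacings of the real spectrum, overlay `wigner_goe`, and flag eigenvectors whose IPR is well above the value expected for delocalized random vectors.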
Methods for time series prediction and classification of gene regulatory networks (GRNs) from gene expression data have been treated separately so far. The recent emergence of attention-based recurrent neural network (RNN) models boosted the interpretability of RNN parameters, making them appealing for the understanding of gene interactions. In this work, we generated synthetic time series gene expression data from a range of archetypal GRNs and we relied on a dual attention RNN to predict the gene temporal dynamics. We show that the prediction is extremely accurate for GRNs with different architectures. Next, we focused on the attention mechanism of the RNN and, using tools from graph theory, we found that its graph properties allow us to hierarchically distinguish different architectures of the GRN. We show that the GRNs respond differently to the addition of noise in the prediction by the RNN and we relate the noise response to the analysis of the attention mechanism. In conclusion, this work provides a way to understand and exploit the attention mechanism of RNNs and it paves the way to RNN-based methods for time series prediction and inference of GRNs from gene expression data.
Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end, we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions, we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady-state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs.
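The idea of a memory function can be illustrated on the simplest possible case, a two-component linear system. This is a textbook-style sketch, not the general nonlinear projection developed in the work above. The symbols $x$ (subnetwork), $y$ (bulk), and the constant coefficients $a$, $b$, $c$, $d$ are assumptions introduced for the example, standing for a linearization around a steady state.

```latex
% Linearized dynamics near a steady state (assumed form):
\begin{align}
  \dot{x}(t) &= a\,x(t) + b\,y(t), \\
  \dot{y}(t) &= c\,x(t) + d\,y(t).
\end{align}
% Solving the bulk equation for y(t):
\begin{equation}
  y(t) = e^{d t}\,y(0) + c \int_0^t e^{d (t-s)}\,x(s)\,\mathrm{d}s .
\end{equation}
% Substituting into the subnetwork equation gives a closed equation for x:
\begin{equation}
  \dot{x}(t) = a\,x(t) + \int_0^t M(t-s)\,x(s)\,\mathrm{d}s + b\,e^{d t}\,y(0),
  \qquad M(\tau) = b\,c\,e^{d \tau}.
\end{equation}
```

Here $M(\tau)$ is the memory function: it describes how $x$ responds to its own past through $y$, and for a stable bulk ($d < 0$) it decays on the timescale $1/|d|$. In a cross-repressive pair both couplings are negative ($b < 0$, $c < 0$), so $M > 0$, which in this linear picture reads as delayed self-activation of $x$ mediated by $y$.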
The two most fundamental processes describing change in biology, development and evolution, occur over drastically different timescales that are difficult to reconcile within a unified framework. Development involves temporal sequences of cell states controlled by hierarchies of regulatory structures. It occurs over the lifetime of a single individual, and is associated with changes in the gene expression levels of a given genotype. Evolution, by contrast, entails genotypic change through the acquisition or loss of genes and changes in the network topology of interactions among genes. It involves the emergence of new, environmentally selected phenotypes over the lifetimes of many individuals. Here we present a model of regulatory network evolution that accounts for both timescales. We extend the framework of Boolean models of gene regulatory networks (GRNs), currently applicable only to describing development, to include evolutionary processes. As opposed to one-to-one maps to specific attractors, we identify the phenotypes of the cells as the relevant macrostates of the GRN. A phenotype may now correspond to multiple attractors, and its formal definition no longer requires a fixed size for the genotype. This opens the possibility of a quantitative study of the phenotypic change of a genotype, which is itself changing over evolutionary timescales. We show how the realization of specific phenotypes can be controlled by gene duplication events (used here as an archetypal evolutionary event able to change the genotype), and how successive events of gene duplication lead to new regulatory structures via selection. At the same time, we show that our generalized framework does not inhibit network controllability and the possibility for network control theory to describe epigenetic signaling during development.
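The distinction between attractors and phenotypes can be made concrete with a toy Boolean network. What follows is a minimal sketch under explicit assumptions, not the model from the work above: the two-gene cross-repressive network, the duplication rule in `toggle_dup` (the copy inherits the regulation of the original, and the target is repressed if either copy is on), synchronous updating, and the definition of a phenotype as the time-averaged state of one readout gene are all hypothetical choices made for illustration.

```python
from itertools import product


def attractors(update, n):
    """Enumerate attractors of an n-gene Boolean network under synchronous
    update. Each attractor is returned as a frozenset of states."""
    found = set()
    for state in product((0, 1), repeat=n):
        seen = {}  # state -> step at which it was first visited
        while state not in seen:
            seen[state] = len(seen)
            state = update(state)
        start = seen[state]  # first revisited state marks the cycle start
        found.add(frozenset(s for s, i in seen.items() if i >= start))
    return found


def phenotypes(update, n, readout=-1):
    """Assumed phenotype definition: time-averaged value of one readout gene
    over each attractor. Several attractors may map to the same phenotype."""
    return {sum(s[readout] for s in cyc) / len(cyc)
            for cyc in attractors(update, n)}


def toggle(state):
    """Two genes repressing each other."""
    x, y = state
    return (1 - y, 1 - x)


def toggle_dup(state):
    """Same motif after duplicating x (hypothetical rule): both copies are
    repressed by y, and y is repressed if either copy is on."""
    x, x2, y = state
    return (1 - y, 1 - y, 1 - (x | x2))


print(len(attractors(toggle, 2)), sorted(phenotypes(toggle, 2)))
print(len(attractors(toggle_dup, 3)), sorted(phenotypes(toggle_dup, 3)))
```

Because the phenotype is defined on a readout rather than on the full state vector, it can be compared between the two-gene and three-gene networks even though their state spaces differ in size.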