We propose a novel two-stage Gene Set Gibbs Sampling (GSGS) framework to reverse engineer signaling pathways from gene sets inferred from molecular profiling data. We hypothesize that signaling pathways are structurally an ensemble of overlapping linear signal transduction events, which we encode as Information Flow Gene Sets (IFGSs). We infer pathways from the gene sets corresponding to these events, in which the order of genes within each set has been randomly permuted. In Stage I, we use a source separation algorithm to derive unordered and overlapping IFGSs from molecular profiling data, allowing crosstalk among IFGSs. In Stage II, we develop a Gibbs-sampling-like algorithm, the Gene Set Gibbs Sampler, to reconstruct signaling pathways from the latent IFGSs derived in Stage I. The novelty of this framework lies in the seamless integration of the two stages and in the hypothesis that IFGSs are the basic building blocks of signaling pathways. In proof-of-concept studies, our approach outperforms existing Bayesian network approaches on both continuous and discrete data generated from benchmark networks in the DREAM initiative. We perform a comprehensive sensitivity analysis to assess the robustness of the approach. Finally, we apply the GSGS framework to reconstruct signaling pathways in breast cancer cells.
Signaling pathways communicate information about extracellular conditions into the cell, to both the nucleus and cytoplasmic processes, in order to control cell responses. Genetic mutations in signaling network components are frequently associated with cancer and can result in cells acquiring an ability to divide and grow uncontrollably. Because signaling pathways play such a significant role in cancer initiation and progression, their constituent proteins are attractive therapeutic targets. In this review, we discuss how signaling pathway modeling can assist with identifying effective drugs for treating diseases such as cancer. Such models would be especially useful if they could identify the controlling biochemical parameters in signaling pathways, such as molecular abundances and chemical reaction rates, because this would help determine effective points of attack for therapeutics.
Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments in genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of gene expression. The learnt DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN successfully finds the two basic network motifs for adaptation: negative feedback and the incoherent feed-forward loop. In the second and much more complex example, the DNN accurately predicts the behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN black box. Our interpretable DNN approach should have broad applications in genotype-phenotype mapping.
Modeling biological rhythms helps us understand the complex principles behind physical and psychological abnormalities of the human body, plan life schedules, and avoid the persistent fatigue and the mood and sleep alterations caused by the desynchronization of those rhythms. The first step in modeling biological rhythms is to identify their characteristics, such as cyclic period, phase, and amplitude. However, human rhythms are susceptible to external events, which cause irregular fluctuations in waveforms and affect the characterization of each rhythm. In this paper, we present our exploratory work towards developing a computational framework for automated discovery and modeling of human rhythms. We first identify cyclic periods in time series data using three different methods and test their performance on both synthetic data and real fine-grained biological data. We observe that all three methods detect consistent periods. We then model inner cycles within each period by identifying change points, in order to observe fluctuations in biological data that may indicate the impact of external events on human rhythms. The results provide initial insights into the design of a computational framework for discovering and modeling human rhythms.
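The abstract above does not name its three period-detection methods. As a minimal sketch of one common option, which is not necessarily among them, the following estimates the dominant cyclic period from the peak of an FFT periodogram. The function name `dominant_period` and the synthetic 24-hour series are illustrative assumptions.

```python
import numpy as np

def dominant_period(signal, sampling_interval=1.0):
    """Estimate the dominant cyclic period from the peak of an FFT periodogram."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the constant offset so it does not dominate
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=sampling_interval)
    peak = np.argmax(power[1:]) + 1  # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Hypothetical synthetic data: a 24-hour rhythm sampled hourly for 10 days,
# with additive Gaussian noise standing in for irregular fluctuations.
rng = np.random.default_rng(0)
t = np.arange(240)
series = np.sin(2 * np.pi * t / 24.0) + 0.3 * rng.standard_normal(t.size)

period = dominant_period(series)  # in units of the sampling interval (hours)
print(f"estimated period: {period:.1f} h")
```

A periodogram can only resolve periods that divide the record length evenly, so on real data it is usually cross-checked against another estimator, such as autocorrelation.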
We develop a theoretical approach to the protein folding problem based on out-of-equilibrium stochastic dynamics. Within this framework, the computational difficulties related to the existence of large time scale gaps in the protein folding problem are removed, and simulating the entire reaction in atomistic detail using existing computers becomes feasible. In addition, this formalism provides a natural framework to investigate the relationships between thermodynamic and kinetic aspects of folding. For example, it is possible to show that, in order to have a large probability of remaining unchanged under Langevin diffusion, the native state has to be characterized by a small conformational entropy. We discuss how to determine the most probable folding pathway, to identify configurations representative of the transition state, and to compute the most probable transition time. We perform an illustrative application of these ideas, studying the conformational evolution of alanine dipeptide within an all-atom model based on the empirical GROMOS96 force field.
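To make the underlying Langevin diffusion concrete, here is a minimal sketch of overdamped Langevin dynamics integrated with the Euler-Maruyama scheme. It is not the approach developed in the abstract. It uses a one-dimensional toy double-well potential, and the potential, the parameter values (`dt`, `kT`, `gamma`) and the function names are all illustrative assumptions.

```python
import numpy as np

def force(x):
    """Force -dU/dx for the toy double-well potential U(x) = (x**2 - 1)**2."""
    return -4.0 * x * (x**2 - 1.0)

def simulate_langevin(x0, n_steps, dt=1e-3, kT=0.2, gamma=1.0, seed=0):
    """Integrate overdamped Langevin dynamics with the Euler-Maruyama scheme.

    Update rule: x <- x + F(x) * dt / gamma + sqrt(2 * kT * dt / gamma) * xi,
    where xi is a standard normal random number.
    """
    rng = np.random.default_rng(seed)
    noise_scale = np.sqrt(2.0 * kT * dt / gamma)
    traj = np.empty(n_steps + 1)
    traj[0] = x0
    for i in range(n_steps):
        drift = force(traj[i]) * dt / gamma
        traj[i + 1] = traj[i] + drift + noise_scale * rng.standard_normal()
    return traj

# Start in the left well (x = -1); the barrier at x = 0 has height 1 = 5 kT.
trajectory = simulate_langevin(x0=-1.0, n_steps=20000)
print("fraction of time spent in the right well:", np.mean(trajectory > 0.0))
```

With the barrier several times larger than `kT`, crossings are rare compared with the integration step. This is a small-scale version of the time scale gap that makes brute-force simulation of folding expensive.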
Many mathematical models of biochemical cell signaling pathways contain a large number of elements (species and reactions). This often makes it difficult to identify critical model elements and to describe the model dynamics. Model reduction techniques can therefore be used as a mathematical tool to minimize the number of variables and parameters. In this thesis, we review some well-known methods of model reduction for cell signaling pathways. We have also developed some new approaches to model reduction. The techniques are the quasi-steady-state approximation (QSSA), the quasi-equilibrium approximation (QEA), lumping of species, and entropy production analysis. They are applied to protein translation pathways with microRNA mechanisms, chemical reaction networks, extracellular signal-regulated kinase (ERK) pathways, NF-κB signal transduction pathways, elongation factor EF-Tu and EF-Ts signaling pathways, and dihydrofolate reductase (DHFR) pathways. The main aim of this thesis is to reduce complex cell signaling pathway models. This provides a better understanding of the dynamics of such models and gives accurate approximate solutions. The results show good agreement between the original models and the simplified models.
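As a minimal illustration of the quasi-steady-state approximation, the sketch below applies it to the textbook enzyme reaction E + S <-> C -> E + P, which is not one of the pathways studied in the thesis. The rate constants, initial concentrations and function names are hypothetical. The two-variable full model is compared with the one-variable QSSA model, in which the complex is assumed to satisfy c = e0 * s / (Km + s).

```python
def rk4(f, y, t_end, dt):
    """Integrate dy/dt = f(y) from t = 0 to t_end with fixed-step RK4."""
    y = list(y)
    for _ in range(int(round(t_end / dt))):
        a = f(y)
        b = f([yi + 0.5 * dt * ai for yi, ai in zip(y, a)])
        c = f([yi + 0.5 * dt * bi for yi, bi in zip(y, b)])
        d = f([yi + dt * ci for yi, ci in zip(y, c)])
        y = [yi + dt * (ai + 2 * bi + 2 * ci + di) / 6.0
             for yi, ai, bi, ci, di in zip(y, a, b, c, d)]
    return y

# Hypothetical parameters, chosen so that e0 << s0 + Km (the QSSA regime).
kf, kr, kcat = 1.0, 1.0, 1.0   # binding, unbinding, catalytic rate constants
e0, s0 = 0.01, 1.0             # total enzyme and initial substrate
Km = (kr + kcat) / kf          # Michaelis constant

def full_model(y):
    """Full mass-action model in substrate s and complex c (free enzyme = e0 - c)."""
    s, c = y
    e = e0 - c
    return [-kf * e * s + kr * c, kf * e * s - (kr + kcat) * c]

def qssa_model(y):
    """Reduced model: the complex is eliminated via dc/dt = 0."""
    (s,) = y
    return [-kcat * e0 * s / (Km + s)]

t_end, dt = 100.0, 0.01
s_full, c_full = rk4(full_model, [s0, 0.0], t_end, dt)
(s_qssa,) = rk4(qssa_model, [s0], t_end, dt)
print(f"s(full) = {s_full:.4f}, s(QSSA) = {s_qssa:.4f}")
```

The reduced model drops one variable and replaces three rate constants with two lumped parameters, `kcat * e0` and `Km`. The gap between the two substrate curves is expected to be of order e0 / (s0 + Km).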