In this perspective article, we present a multidisciplinary approach for characterizing protein structure networks. We first place our approach in its historical context and describe how it synthesizes concepts from quantum chemistry, the biology of polymer conformations, matrix mathematics, and percolation theory. We then explicitly provide the method for constructing the protein structure network in terms of non-covalently interacting amino acid side chains and show how a wealth of information can be obtained from the graph spectra of these networks. Employing suitable mathematical approaches, such as the use of a weighted Laplacian matrix to generate the spectra, enables us to develop rigorous methods for network comparison and to identify, through a perturbation approach, the crucial nodes responsible for network integrity. Our scoring methods have several applications in structural biology that elude conventional methods of analysis. Here, we discuss three instances: (a) protein structure comparison that includes the details of side chain connectivity, (b) the contribution to node clustering as a function of bound ligand, which explains the global effect of local changes in phenomena such as allostery, and (c) the identification of amino acids crucial for structural integrity, derived purely from the spectra of the graph. We demonstrate how our method enables us to obtain valuable information on key proteins involved in cellular functions and diseases, such as GPCR and HIV protease, and discuss the biological implications. We then briefly describe how concepts from percolation theory further augment our analyses. In our concluding perspective on future developments, we suggest a further unifying approach to protein structure analyses and a judicious choice of questions for applying our methods to larger, more complex networks, such as metabolic and disease networks.
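As a rough illustration of the spectral and perturbation ideas in the abstract above, the following minimal sketch builds a weighted Laplacian L = D - W for a small undirected graph, computes its spectrum, and deletes one node to see how the spectrum changes. Everything here is an assumption made for illustration: the function names, the toy edge weights, the use of cyclic Jacobi rotations as the eigenvalue solver, and the choice of the second-smallest eigenvalue as a connectivity indicator. It is not the authors' network construction or scoring scheme.

```python
import math

def weighted_laplacian(n, edges):
    """Build L = D - W for an undirected weighted graph with n nodes.

    edges is a list of (i, j, weight) tuples; names are hypothetical.
    """
    L = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L

def symmetric_eigenvalues(A, tol=1e-20, max_sweeps=100):
    """Sorted eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    for _ in range(max_sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-15:
                    continue
                diff = A[q][q] - A[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))

def remove_node(n, edges, node):
    """Delete one node and its edges, relabelling the remaining nodes 0..n-2."""
    relabel = {old: new for new, old in enumerate(k for k in range(n) if k != node)}
    kept = [(relabel[i], relabel[j], w) for i, j, w in edges if node not in (i, j)]
    return n - 1, kept

# Toy four-node chain with made-up interaction weights.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
full_spectrum = symmetric_eigenvalues(weighted_laplacian(4, toy_edges))
n_cut, edges_cut = remove_node(4, toy_edges, 1)
cut_spectrum = symmetric_eigenvalues(weighted_laplacian(n_cut, edges_cut))
print(full_spectrum, cut_spectrum)
```

In this toy chain, deleting an interior node disconnects the graph, which shows up as a second zero eigenvalue in the perturbed spectrum. In practice a library routine for symmetric eigenproblems would replace the hand-written solver.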
Many unicellular organisms allocate their key proteins asymmetrically between the mother and daughter cells, especially in a stressed environment. A recent theoretical model predicts when asymmetric segregation of key proteins enhances population fitness, extrapolating the solution at two limits: when the segregation is perfectly asymmetric (asymmetry $a = 1$) and when the asymmetry is small ($0 \leq a \ll 1$). We generalize the model by introducing stochasticity and use a transport equation to obtain a self-consistent equation for the population growth rate and the distribution of the amount of key proteins. We provide two ways of solving the self-consistent equation: numerically, by iteratively updating the solution, and analytically, by expanding moments of the distribution. With these more powerful tools, we extend the previous model by Lin et al. to include stochasticity in the segregation asymmetry. We show that the stochastic model is equivalent to the deterministic one with a modified effective asymmetry parameter ($a_{\rm eff}$). We discuss the biological implications of our models and compare them with other theoretical models.
We study a statistical model describing the steady-state distribution of the fluxes in a metabolic network. The resulting model on continuous variables can be solved by the cavity method. In particular, the model becomes analytically tractable when the cavity equation is solved over an ensemble of networks with the same degree distribution as the real metabolic network. The flux distribution that optimizes production of biomass has a fat tail with a power-law exponent independent of the structural properties of the underlying network. These results are in complete agreement with the Flux-Balance-Analysis outcome for the same system and in qualitative agreement with the experimental results.
Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structure, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail - an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically guided refinement for inferring atomic coordinates from PSNs. This neural upscaling procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 $\mu$s atomistic molecular dynamics trajectory of A$\beta_{1-40}$, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs.
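The point that a PSN constrains, but does not uniquely determine, the underlying configuration can be made concrete with a small sketch. The abstract above does not say how its PSNs are defined, so the code below assumes a simple distance-cutoff contact graph over one representative point per residue. The function names, the 5.0 cutoff, and the toy coordinates are all hypothetical, and no part of the upscaling procedure itself is shown.

```python
import math

def build_psn(coords, cutoff):
    """Return the set of residue pairs (i, j), i < j, closer than `cutoff`.

    Assumption: one 3D point per residue and a plain distance cutoff.
    """
    edges = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.add((i, j))
    return edges

def consistent_with_psn(coords, psn, cutoff):
    """Check whether a configuration maps onto exactly the given PSN."""
    return build_psn(coords, cutoff) == psn

# Three made-up four-residue configurations with 3.8-unit spacing.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

target = build_psn(straight, cutoff=5.0)
print(consistent_with_psn(bent, target, cutoff=5.0))
print(consistent_with_psn(folded, target, cutoff=5.0))
```

Under these assumptions the straight and bent chains share one network, while the folded chain gains an extra contact and is excluded. This is the sense in which a PSN narrows the space of admissible configurations without fixing a single one.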
Synthetic biology aims at designing modular genetic circuits that can be assembled according to the desired function. When embedded in a cell, a circuit module becomes a small subnetwork within a larger environmental network, and its dynamics is therefore affected by potentially unknown interactions with the environment. It is well known that the presence of the environment causes not only extrinsic noise but also memory effects: the dynamics of the subnetwork is affected by its past states via a memory function that is characteristic of the environment. We study several generic scenarios for the coupling between a small module and a larger environment, with the environment consisting of a chain of mono-molecular reactions. By mapping the dynamics of this coupled system onto random walks, we are able to give exact analytical expressions for the resulting memory functions. Our results thus give insight into the possible types of memory functions and thereby help to better predict subnetwork dynamics.
The common techniques for studying protein-protein proximity in vivo are not well adapted to the capabilities and expertise of a standard proteomics laboratory, which is typically based on the use of mass spectrometry. With the aim of closing this gap, we have developed PUB-MS (for Proximity Utilizing Biotinylation and Mass Spectrometry), an approach to monitor protein-protein proximity, based on biotinylation of a protein fused to a biotin-acceptor peptide (BAP) by a biotin-ligase, BirA, fused to its interaction partner. The biotinylation status of the BAP can then be detected by either Western analysis or mass spectrometry. The BAP sequence was redesigned for easy monitoring of the biotinylation status by LC-MS/MS. In several experimental models, we demonstrate that the biotinylation in vivo is specifically enhanced when the BAP- and BirA-fused proteins are in proximity to each other. The advantage of mass spectrometry is demonstrated by using BAPs with different sequences in a single experiment (allowing multiplex analysis) and by the use of stable isotopes. Finally, we show that our methodology can also be used to study a specific subfraction of a protein of interest that was in proximity with another protein at a predefined time before the analysis.