Many proteins carry out their biological functions by folding into characteristic tertiary structures. Searching for the stable states of proteins by molecular simulation is therefore important for understanding their functions and stabilities. However, finding the stable state by conformational search is difficult because the energy landscape of the system is characterized by many local minima separated by high energy barriers. To overcome this difficulty, various sampling and optimization methods for protein conformations have been proposed. In this study, we propose a new conformational search method for proteins that uses genetic crossover and the Metropolis criterion. We applied this method to an $\alpha$-helical protein. The conformations obtained from the simulations are in good agreement with the experimental results.
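As a rough illustration of the two ingredients named in this abstract, the sketch below pairs a segment crossover with a Metropolis acceptance test. It is a minimal sketch under stated assumptions, not the authors' implementation: conformations are assumed to be plain lists of dihedral angles, and the names `segment_crossover` and `metropolis_accept`, the segment indices, and the choice of energy units (kcal/mol) are all hypothetical.

```python
import math
import random

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def segment_crossover(parent_a, parent_b, start, stop):
    """Swap the segment [start, stop) between two conformations.

    Conformations are assumed to be equal-length lists of dihedral angles.
    """
    child_a = parent_a[:start] + parent_b[start:stop] + parent_a[stop:]
    child_b = parent_b[:start] + parent_a[start:stop] + parent_b[stop:]
    return child_a, child_b


def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Metropolis criterion: always accept downhill moves, accept uphill
    moves with probability exp(-(e_new - e_old) / (k_B * T))."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / (K_B * temperature))
```

In a search loop, a child produced by `segment_crossover` would replace its parent only if `metropolis_accept` returns `True` for the parent and child energies.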
We combined genetic crossover, one of the operations of the genetic algorithm, with the replica-exchange method in parallel molecular dynamics simulations. The combined genetic crossover and replica-exchange method can search the global conformational space by exchanging the corresponding parts between a pair of conformations of a protein. In this study, we applied this method to an $\alpha$-helical protein, the Trp-cage mini protein, which has 20 amino-acid residues. The conformations obtained from the simulations are in good agreement with the experimental results.
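For readers unfamiliar with replica exchange, the standard acceptance rule for swapping two replicas at different temperatures can be sketched as follows. This shows only the textbook temperature-exchange criterion; how the abstract's method interleaves it with crossover is not specified here, and the function name and the kcal/mol energy units are assumptions.

```python
import math

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def exchange_probability(e_i, e_j, t_i, t_j):
    """Probability of swapping replicas i and j.

    Replica i has potential energy e_i at temperature t_i, and likewise
    for j. The standard rule is min(1, exp((beta_i - beta_j) * (e_i - e_j))).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # Avoid overflow in exp() when the swap is certain to be accepted.
    return 1.0 if delta >= 0.0 else math.exp(delta)
```

A swap that moves the lower-energy conformation to the lower temperature is always accepted; the reverse swap is accepted only with a Boltzmann-weighted probability.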
Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSAs) are emerging as powerful tools in computational biology. However, the underlying assumptions about the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics have remained untested so far. Here we use a lattice protein (LP) model to benchmark those inverse statistical approaches. We build MSAs of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSAs. We find that the inferred Potts Hamiltonians reproduce many important aspects of the true LP structures and energetics. Careful analysis reveals that the effective pairwise couplings in the inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of the native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models, when used as protein Hamiltonians for the design of new sequences, are able to generate with high probability completely new sequences with the desired folds, which is not possible using independent-site models. These are remarkable results, as the effective LP Hamiltonians used to generate the MSAs are not simple pairwise models, owing to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, as well as their limitations.
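To make the term "pairwise Potts Hamiltonian" concrete, the sketch below evaluates the energy of one sequence from site fields and pairwise couplings. It is a minimal sketch under assumptions: the sign convention (lower energy means more favorable), the data layout, and the name `potts_energy` are illustrative choices and are not taken from the abstract.

```python
def potts_energy(sequence, fields, couplings):
    """Energy of a sequence under a pairwise Potts model.

    E(a) = - sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j)

    sequence  : list of integer residue states, one per site
    fields    : fields[i][a] is h_i(a)
    couplings : couplings[(i, j)][a][b] is J_ij(a, b) for i < j
    """
    energy = 0.0
    n_sites = len(sequence)
    for i in range(n_sites):
        energy -= fields[i][sequence[i]]
        for j in range(i + 1, n_sites):
            energy -= couplings[(i, j)][sequence[i]][sequence[j]]
    return energy
```

An independent-site model corresponds to setting every coupling to zero, which leaves only the field terms.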
Determining which proteins interact together is crucial to a systems-level understanding of the cell. Recently, algorithms based on Direct Coupling Analysis (DCA) pairwise maximum-entropy models have made it possible to identify interaction partners among paralogous proteins from sequence data. This success of DCA at predicting protein-protein interactions could be mainly based on its known ability to identify pairs of residues that are in contact in the three-dimensional structure of protein complexes and that coevolve to remain physicochemically complementary. However, interacting proteins possess similar evolutionary histories. What is the role of purely phylogenetic correlations in the performance of DCA-based methods to infer interaction partners? To address this question, we employ controlled synthetic data that only involve phylogeny and no interactions or contacts. We find that DCA accurately identifies the pairs of synthetic sequences that share evolutionary history. While phylogenetic correlations confound the identification of contacting residues by DCA, they are thus useful for predicting interacting partners among paralogs. We find that DCA performs as well as phylogenetic methods to this end, and slightly better than them with large and accurate training sets. Employing DCA or phylogenetic methods within an Iterative Pairing Algorithm (IPA) makes it possible to predict pairs of evolutionary partners without a training set. We demonstrate the ability of these various methods to correctly predict pairings among real paralogous proteins with genome proximity but no known physical interaction, illustrating the importance of phylogenetic correlations in natural data. However, for physically interacting and strongly coevolving proteins, DCA and mutual information outperform phylogenetic methods. We discuss how to distinguish physically interacting proteins from those only sharing evolutionary history.
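The mutual information mentioned in this abstract can be illustrated at the level of two alignment columns. The sketch below is a plain plug-in estimate from observed frequencies, with no pseudocounts or sequence reweighting, and it is not the pairing score used by the authors; the name `column_mutual_information` is hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(column_x, column_y):
    """Plug-in mutual information (in nats) between two alignment columns.

    column_x and column_y hold one residue per sequence, in the same
    sequence order. No pseudocounts or sequence weights are applied.
    """
    if len(column_x) != len(column_y) or len(column_x) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(column_x)
    count_x = Counter(column_x)
    count_y = Counter(column_y)
    count_xy = Counter(zip(column_x, column_y))
    mi = 0.0
    for (x, y), count in count_xy.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi
```

Two perfectly covarying columns with two equally frequent states give log 2 nats, while statistically independent columns give zero.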
Cells use genetic switches to shift between alternate stable gene expression states, e.g., to adapt to new environments or to follow a developmental pathway. Conceptually, these stable phenotypes can be considered as attractive states on an epigenetic landscape, with phenotypic changes being transitions between states. Measuring these transitions is challenging because they are both very rare in the absence of appropriate signals and very fast. As such, it has proven difficult to experimentally map the epigenetic landscapes that are widely believed to underlie developmental networks. Here, we introduce a new nonequilibrium perturbation method to help reconstruct a regulatory network's epigenetic landscape. We derive the mathematical theory needed and then use the method on simulated data to reconstruct the landscapes. Our results show that with a relatively small number of perturbation experiments it is possible to recover an accurate representation of the true epigenetic landscape. We propose that our theory provides a general method by which epigenetic landscapes can be studied. Finally, our theory suggests that the total perturbation impulse required to induce a switch between metastable states is a fundamental quantity in developmental dynamics.
Elastic network models (ENMs) are valuable and efficient tools for characterizing the collective internal dynamics of proteins based on knowledge of their native structures. The increasing evidence that the biological functionality of RNAs is often linked to their innate internal motions poses the question of whether ENM approaches can be successfully extended to this class of biomolecules. This issue is tackled here by considering various families of elastic networks of increasing complexity applied to a representative set of RNAs. The fluctuations predicted by the alternative ENMs are stringently validated by comparison against extensive molecular dynamics simulations and SHAPE experiments. We find that simulations and experimental data are systematically best reproduced by either an all-atom or a three-beads-per-nucleotide representation (sugar-base-phosphate), with the latter arguably providing the best balance of accuracy and computational complexity.
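The basic construction behind an elastic network can be sketched as a connectivity (Kirchhoff) matrix built from bead coordinates and a distance cutoff. This is a simplified, Gaussian-network-style illustration and not the specific ENM variants compared in the abstract; the bead coordinates, the cutoff value, and the name `kirchhoff_matrix` are assumptions.

```python
import math


def kirchhoff_matrix(coords, cutoff):
    """Connectivity matrix of a cutoff-based elastic network.

    coords : list of (x, y, z) bead positions from the native structure
    cutoff : beads closer than this distance are joined by a spring

    Off-diagonal entries are -1 for connected pairs and 0 otherwise;
    each diagonal entry is the number of springs attached to that bead.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma
```

Changing the bead representation, for example from one bead per nucleotide to sugar, base, and phosphate beads, only changes the `coords` list passed in; predicted fluctuations would then follow from the pseudo-inverse of this matrix, which is not computed here.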