We describe results obtained from an improved model for protein folding. Good agreement with the native structure of a 46-residue, five-letter protein segment is obtained by carefully tuning the parameters of the self-avoiding energy. In particular, we find an improved free-energy profile. We also compare the efficiency of the multidimensional replica exchange method with that of the widely used parallel tempering.
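As an illustration of the parallel tempering baseline mentioned above, the following is a minimal sketch of the standard Metropolis criterion for exchanging configurations between two replicas held at different inverse temperatures. The function names, the `rng` parameter, and the numerical values are hypothetical and chosen for illustration only; this is not the implementation used in the work, and the multidimensional variant would exchange along additional parameter axes as well.

```python
import math
import random


def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of swapping the configurations of two replicas.

    Replica i sits at inverse temperature beta_i with energy energy_i,
    replica j at beta_j with energy energy_j. The standard criterion is
    min(1, exp[(beta_i - beta_j) * (energy_i - energy_j)]).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the proposed exchange is accepted (hypothetical helper)."""
    return rng.random() < swap_acceptance(beta_i, beta_j, energy_i, energy_j)
```

With a cold replica (beta = 1.0) at energy 2.0 and a hot replica (beta = 0.5) at energy 1.0, the swap is always accepted, because it moves the lower-energy configuration to the colder temperature. The reverse situation is accepted only with probability exp(-0.5).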
Significant progress in computer hardware and software has enabled molecular dynamics (MD) simulations to model complex biological phenomena such as protein folding. However, enabling MD simulations to access biologically relevant timescales (e.g., beyond milliseconds) remains challenging. The challenges include (1) quantifying which states have already been (sufficiently) sampled in an ensemble of MD runs, and (2) identifying novel states from which simulations can be initiated to sample rare events (e.g., folding events). Given the recent success of deep learning and artificial intelligence techniques in analyzing large datasets, we posit that these techniques can also be used to adaptively guide MD simulations to model such complex biological phenomena. Leveraging our recently developed unsupervised deep learning technique for clustering protein folding trajectories into partially folded intermediates, we build an iterative workflow that couples our generative model with all-atom MD simulations to fold small protein systems on emerging high-performance computing platforms. We demonstrate our approach by folding Fs-peptide and the $\beta\beta\alpha$ (BBA) fold, FSD-EY. Our adaptive workflow achieves an overall root-mean-square deviation (RMSD) to the native state of 1.6~\AA{} and 4.4~\AA{} for Fs-peptide and FSD-EY, respectively. We also highlight some emerging challenges in designing scalable workflows in which data-intensive deep learning techniques are coupled to compute-intensive MD simulations.
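The two challenges listed above, knowing which states are already well sampled and choosing new starting states, can be sketched in a few lines. The example below assumes that each trajectory frame has already been assigned an integer cluster label by some clustering model, and it restarts from the least-populated clusters. That selection rule is a common adaptive-sampling heuristic used here purely for illustration, not necessarily the rule used in the workflow described; the function name `pick_restart_frames` is hypothetical.

```python
from collections import Counter


def pick_restart_frames(cluster_labels, n_restarts):
    """Choose frames from the least-sampled clusters as new MD starting points.

    cluster_labels[k] is the (assumed, precomputed) cluster label of frame k.
    Returns the index of the first frame seen in each of the n_restarts
    least-populated clusters, rarest cluster first.
    """
    counts = Counter(cluster_labels)
    # Rank clusters from least to most populated; break ties by label
    # so that the result is deterministic.
    rare_first = sorted(counts, key=lambda label: (counts[label], label))
    first_frame = {}
    for index, label in enumerate(cluster_labels):
        first_frame.setdefault(label, index)
    return [first_frame[label] for label in rare_first[:n_restarts]]
```

In an iterative loop, the returned frame indices would seed the next round of simulations, after which the enlarged trajectory set is re-clustered and the selection repeated.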
The folding pathway and rate coefficients for the folding of a knotted protein are calculated for a potential energy function with minimal energetic frustration. A kinetic transition network is constructed using the discrete path sampling approach, and the resulting potential energy surface is visualized by constructing disconnectivity graphs. Owing to topological constraints, the low-lying portion of the landscape consists of three distinct regions, corresponding to the native knotted state and to configurations where either the N- or C-terminus is not yet folded into the knot. The fastest folding pathways from denatured states exhibit early formation of the N-terminus portion of the knot and a rate-determining step where the C-terminus is incorporated. The low-lying minima with the N-terminus knotted and the C-terminus free therefore constitute an off-pathway intermediate for this model. The insertion of both the N- and C-termini into the knot occurs late in the folding process, creating large energy barriers that are the rate-limiting steps in the folding process. Compared to other proteins of a similar length, this system folds over six orders of magnitude more slowly.
Energy landscape theory describes how a full-length protein can attain its native fold after sampling only a tiny fraction of all possible structures. Although protein folding is now understood to be concomitant with synthesis on the ribosome, there have been few attempts to modify energy landscape theory to account for cotranslational folding. This paper introduces a model for cotranslational folding that leads to a natural definition of a nested energy landscape. By applying concepts drawn from submanifold differential geometry, the dynamics of protein folding on the ribosome can be explored quantitatively, and conditions on the nested potential energy landscapes for a good cotranslational folder are obtained. A generalisation of diffusion rate theory using van Kampen's technique of composite stochastic processes is then used to account for entropic contributions and the effects of variable translation rates on cotranslational folding. This stochastic approach agrees well with experimental results and with the Hamiltonian formalism in the deterministic limit.
Models of protein energetics which neglect interactions between amino acids that are not adjacent in the native state, such as the Go model, encode or underlie many influential ideas on protein folding. Implicit in this simplification is a crucial assumption that has never been critically evaluated in a broad context: Detailed mechanisms of protein folding are not biased by non-native contacts, typically imagined as a consequence of sequence design and/or topology. Here we present, using computer simulations of a well-studied lattice heteropolymer model, the first systematic test of this oft-assumed correspondence over the statistically significant range of hundreds of thousands of amino acid sequences, and a concomitantly diverse set of folding pathways. Enabled by a novel means of fingerprinting folding trajectories, our study reveals a profound insensitivity of the order in which native contacts accumulate to the omission of non-native interactions. Contrary to conventional thinking, this robustness does not arise from topological restrictions and does not depend on folding rate. We find instead that the crucial factor in discriminating among topological pathways is the heterogeneity of native contact energies. Our results challenge conventional thinking on the relationship between sequence design and free energy landscapes for protein folding, and help justify the widespread use of Go-like models to scrutinize detailed folding mechanisms of real proteins.
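To make the simplification discussed above concrete, the following is a minimal sketch of a Go-like energy on a square or cubic lattice: only contacts present in the native structure contribute, and each native contact may carry its own energy, which reflects the heterogeneity of native contact energies highlighted above. The function names, the four-residue chain, and the contact energy of -1.5 are hypothetical choices for illustration, not the model or parameters used in the study.

```python
def contacts(conformation):
    """Return non-bonded residue pairs (i, j) that sit on adjacent lattice sites.

    conformation is a list of integer lattice coordinates, one per residue.
    On a square or cubic lattice, contacts require j >= i + 3.
    """
    found = set()
    for i in range(len(conformation)):
        for j in range(i + 3, len(conformation)):
            distance = sum(abs(a - b) for a, b in zip(conformation[i], conformation[j]))
            if distance == 1:
                found.add((i, j))
    return found


def go_energy(conformation, native_contact_energies):
    """Sum the energies of native contacts only; non-native contacts contribute zero."""
    return sum(native_contact_energies.get(pair, 0.0) for pair in contacts(conformation))


# Hypothetical four-residue chain: a folded square and a fully extended state.
native = [(0, 0), (1, 0), (1, 1), (0, 1)]
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]
native_contact_energies = {(0, 3): -1.5}  # assumed value, for illustration only
```

Here `go_energy(native, native_contact_energies)` evaluates the single native contact, while the extended chain has no contacts and therefore zero energy. A model with non-native interactions would add terms for contacts absent from `native_contact_energies`.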
Understanding protein folding has been one of the great challenges in biochemistry and molecular biophysics. Over the past 50 years, many thermodynamic and kinetic studies have addressed the stability of globular proteins. In comparison, advances in the membrane protein folding field lag far behind. Although membrane proteins constitute about a third of the proteins encoded in known genomes, stability studies on membrane proteins have been impaired by experimental limitations. Furthermore, no systematic experimental strategies are available for folding these biomolecules in vitro. Common denaturing agents such as chaotropes usually do not work on helical membrane proteins, and ionic detergents have been successful denaturants in only a few cases. Refolding a membrane protein seems to be a craftsman's work, which is relatively straightforward for transmembrane β-barrel proteins but challenging for α-helical membrane proteins. Additional complexities emerge in multidomain membrane proteins, data interpretation being one of the most critical. In this review, we describe some recent efforts to understand the folding mechanisms of membrane proteins that have been reversibly refolded, allowing both thermodynamic and kinetic analysis. This information is discussed in the context of current paradigms in the protein folding field.