Energy landscape theory describes how a full-length protein can attain its native fold after sampling only a tiny fraction of all possible structures. Although protein folding is now understood to be concomitant with synthesis on the ribosome, there have been few attempts to modify energy landscape theory to account for cotranslational folding. This paper introduces a model for cotranslational folding that leads to a natural definition of a nested energy landscape. By applying concepts drawn from submanifold differential geometry, the dynamics of protein folding on the ribosome can be explored quantitatively, and conditions on the nested potential energy landscapes for a good cotranslational folder are obtained. A generalisation of diffusion rate theory using van Kampen's technique of composite stochastic processes is then used to account for entropic contributions and the effects of variable translation rates on cotranslational folding. This stochastic approach agrees well with experimental results and with the Hamiltonian formalism in the deterministic limit.
The folding pathway and rate coefficients for a knotted protein are calculated for a potential energy function with minimal energetic frustration. A kinetic transition network is constructed using the discrete path sampling approach, and the resulting potential energy surface is visualized by constructing disconnectivity graphs. Owing to topological constraints, the low-lying portion of the landscape consists of three distinct regions, corresponding to the native knotted state and to configurations where either the N- or C-terminus is not yet folded into the knot. The fastest folding pathways from denatured states exhibit early formation of the N-terminus portion of the knot and a rate-determining step where the C-terminus is incorporated. The low-lying minima with the N-terminus knotted and the C-terminus free therefore constitute an off-pathway intermediate for this model. The insertion of both the N- and C-termini into the knot occurs late in the folding process, creating large energy barriers that are the rate-limiting steps. Compared to other proteins of a similar length, this system folds over six orders of magnitude more slowly.
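As an illustration of the idea behind a disconnectivity graph, the following minimal Python sketch finds the lowest energy threshold at which two minima of a kinetic transition network become connected. The network, the energies, and the function name `connection_energy` are hypothetical and chosen only for illustration; they are not taken from the knotted-protein database described above.

```python
def connection_energy(n_minima, transition_states, a, b):
    """Lowest energy threshold at which minima a and b become connected.

    transition_states is a list of (energy, i, j) tuples, each linking
    minima i and j. Transition states are added in order of increasing
    energy and connected minima are merged with a union-find structure,
    which is the same superbasin analysis a disconnectivity graph displays.
    Returns None if a and b are never connected.
    """
    parent = list(range(n_minima))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for energy, i, j in sorted(transition_states):
        parent[find(i)] = find(j)
        if find(a) == find(b):
            return energy
    return None


# Hypothetical three-minimum network (arbitrary energy units):
# 0 = denatured, 1 = partially knotted intermediate, 2 = native knot.
example_ts = [(5.0, 0, 1), (12.0, 1, 2), (20.0, 0, 2)]
threshold = connection_energy(3, example_ts, 0, 2)  # 12.0, via minimum 1
```

In this toy network the direct route from 0 to 2 crosses a transition state at 20.0, but the two minima already merge at 12.0 through the intermediate, so that is the energy at which their branches would join in the graph.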
The intricate three-dimensional geometries of protein tertiary structures underlie protein function and emerge through a folding process from one-dimensional chains of amino acids. The exact spatial sequence and configuration of amino acids, the biochemical environment, and the temporal sequence of distinct interactions yield a complex folding process that cannot yet be easily tracked for all proteins. To gain qualitative insights into the fundamental mechanisms behind the folding dynamics and generic features of the folded structure, we propose a simple model of structure formation that takes into account only fundamental geometric constraints and otherwise assumes randomly paired connections. We find that, despite its simplicity, the model results in a network ensemble consistent with key overall features of the ensemble of Protein Residue Networks we obtained from more than 1000 biological protein geometries available through the Protein Data Bank. Specifically, the distribution of the number of interaction neighbors a unit (amino acid) has, the scaling of the structure's spatial extent with chain length, the eigenvalue spectrum, and the scaling of the smallest relaxation time with chain length are all consistent between model and real proteins. These results indicate that geometric constraints alone may already account for a number of generic features of protein tertiary structures.
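One of the quantities compared above, the distribution of the number of interaction neighbors per residue, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' construction: the 8 Å distance cutoff, the use of one coordinate per residue, and the inclusion of chain neighbors as contacts are all assumptions made here.

```python
import math
from collections import Counter


def degree_distribution(coords, cutoff=8.0):
    """Count how many residues have each number of spatial neighbors.

    coords holds one (x, y, z) position per residue. Two residues are
    treated as interacting when their distance is at most `cutoff`.
    The cutoff value is an assumed parameter, not one from the source.
    """
    n = len(coords)
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                degrees[i] += 1
                degrees[j] += 1
    return Counter(degrees)


# Hypothetical fully extended chain: five residues spaced 3.8 apart on a line.
chain = [(3.8 * k, 0.0, 0.0) for k in range(5)]
histogram = degree_distribution(chain)
```

For a real structure the coordinates would typically come from a parsed structure file; a compact fold would shift this histogram toward higher degrees than the extended chain used here.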
We report the folding thermodynamics of ccUUCGgg and ccGAGAgg RNA tetraloops using atomistic molecular dynamics simulations. We obtain a previously unreported estimate of the folding free energy using parallel tempering in combination with well-tempered metadynamics. A key ingredient is the use of a recently developed metric distance, eRMSD, as a biased collective variable. We find that the native fold of both tetraloops is not the global free energy minimum using the Amber χOL3 force field. The estimated folding free energies are 30.2 kJ/mol for UUCG and 7.5 kJ/mol for GAGA, in striking disagreement with experimental data. We evaluate the viability of all possible one-dimensional backbone force field corrections. We find that disfavoring the gauche+ region of the α and ζ angles consistently improves the existing force field. Judged against available thermodynamic data and solution experiments, however, the level of accuracy achieved with these corrections cannot be considered sufficient.
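To make the sign and size of these free energies concrete, the following Python sketch converts a folding free energy into an equilibrium folded fraction. It assumes a simple two-state (folded/unfolded) picture, the convention that the folding free energy is G(folded) minus G(unfolded), and a temperature of 300 K; none of these is stated in the text above, and the function names are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def folded_fraction(delta_g_fold, temperature=300.0):
    """Two-state folded population for a folding free energy in kJ/mol.

    delta_g_fold is taken as G(folded) - G(unfolded), so a positive value
    means the unfolded ensemble is favored. The default temperature is an
    assumption made for illustration.
    """
    ratio = math.exp(-delta_g_fold / (R_KJ_PER_MOL_K * temperature))
    return ratio / (1.0 + ratio)


def folding_free_energy(p_folded, temperature=300.0):
    """Inverse relation: free energy in kJ/mol from a folded population."""
    return -R_KJ_PER_MOL_K * temperature * math.log(p_folded / (1.0 - p_folded))


# The two values quoted in the text, evaluated at the assumed temperature.
p_uucg = folded_fraction(30.2)
p_gaga = folded_fraction(7.5)
```

Under these assumptions both populations fall well below one half, with the UUCG value smaller by several orders of magnitude, which is what a native fold that is not the global free energy minimum looks like in population terms.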
Significant progress in computer hardware and software has enabled molecular dynamics (MD) simulations to model complex biological phenomena such as protein folding. However, enabling MD simulations to access biologically relevant timescales (e.g., beyond milliseconds) remains challenging. The challenges include (1) quantifying which set of states has already been (sufficiently) sampled in an ensemble of MD runs, and (2) identifying novel states from which simulations can be initiated to sample rare events (e.g., folding events). With the recent success of deep learning and artificial intelligence techniques in analyzing large datasets, we posit that these techniques can also be used to adaptively guide MD simulations to model such complex biological phenomena. Leveraging our recently developed unsupervised deep learning technique to cluster protein folding trajectories into partially folded intermediates, we build an iterative workflow that couples our generative model with all-atom MD simulations to fold small protein systems on emerging high-performance computing platforms. We demonstrate our approach by folding Fs-peptide and the ββα (BBA) fold, FSD-EY. Our adaptive workflow achieves an overall root-mean-square deviation (RMSD) to the native state of 1.6 Å and 4.4 Å for Fs-peptide and FSD-EY, respectively. We also highlight some emerging challenges in designing scalable workflows when data-intensive deep learning techniques are coupled to compute-intensive MD simulations.
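The second challenge, choosing states from which to launch new simulations, can be illustrated with a minimal Python sketch. It assumes that every sampled frame has already been assigned an integer cluster label (in the workflow above this comes from the deep learning model) and it uses a simple least-visited heuristic. That heuristic and the function name `pick_restart_states` are assumptions made for illustration, not the selection criterion used by the authors.

```python
from collections import Counter


def pick_restart_states(cluster_labels, n_seeds):
    """Return the n_seeds least-visited cluster labels, rarest first.

    cluster_labels holds one integer label per sampled frame. Ties in the
    visit count are broken by the smaller label so the result is
    deterministic.
    """
    counts = Counter(cluster_labels)
    ranked = sorted(counts.items(), key=lambda item: (item[1], item[0]))
    return [label for label, _ in ranked[:n_seeds]]


# Hypothetical labels for ten frames from an ensemble of short MD runs.
labels = [0, 0, 0, 1, 1, 2, 0, 3, 3, 3]
seeds = pick_restart_states(labels, 2)  # [2, 1]
```

In an iterative loop, new trajectories would be started from frames in the selected clusters, the combined data re-clustered, and the selection repeated until the visit counts stop changing appreciably.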
Understanding protein folding has been one of the great challenges in biochemistry and molecular biophysics. Over the past 50 years, many thermodynamic and kinetic studies have addressed the stability of globular proteins. In comparison, advances in the membrane protein folding field lag far behind. Although membrane proteins constitute about a third of the proteins encoded in known genomes, stability studies on membrane proteins have been hampered by experimental limitations. Furthermore, no systematic experimental strategies are available for folding these biomolecules in vitro. Common denaturing agents such as chaotropes usually do not work on helical membrane proteins, and ionic detergents have been successful denaturants in only a few cases. Refolding a membrane protein remains something of a craft, relatively straightforward for transmembrane β-barrel proteins but challenging for α-helical membrane proteins. Additional complexities emerge in multidomain membrane proteins, data interpretation being one of the most critical. In this review, we describe some recent efforts to understand the folding mechanism of membrane proteins that have been reversibly refolded, allowing both thermodynamic and kinetic analysis. This information is discussed in the context of current paradigms in the protein folding field.