In spite of decades of research, much remains to be discovered about folding: the detailed structure of the initial (unfolded) state, vestigial folding instructions remaining only in the unfolded state, the interaction of the molecule with the solvent, instantaneous power at each point within the molecule during folding, the fact that the process is stable in spite of myriad possible disturbances, potential stabilization of trajectory by chaos, and, of course, the exact physical mechanism (code or instructions) by which the folding process is specified in the amino acid sequence. Simulations based upon microscopic physics have had some spectacular successes and continue to improve, particularly as supercomputer capabilities increase. The simulations, exciting as they are, are still too slow and expensive to deal with the enormous number of molecules of interest. In this paper, we introduce an approximate model based upon physics, empirics, and information science, which is proposed for use in machine learning applications in which very large numbers of sub-simulations must be made. In particular, we focus upon machine learning applications in the learning phase and argue that our model is sufficiently close to the physics that, in spite of its approximate nature, it can facilitate stepping through machine learning solutions to explore the mechanics of folding mentioned above. We particularly emphasize, as attractive targets for such machine learning analysis, the exploration of energy flow (power) within the molecule during folding, the possibility of energy scale invariance (above a threshold), and vestigial information in the unfolded state, together with statistical analysis of an ensemble of folding micro-steps.
A model is proposed to describe the mechanism of conformational dynamics in secondary protein structure based on matter interactions. The approach uses the Lagrangian method and imposes a certain symmetry breaking. The protein backbone is initially assumed to be nonlinear and is represented by the sine-Gordon equation, while the nonlinear external bosonic sources are represented by a $\phi^4$ interaction. It is argued that the nonlinear source induces the folding pathway in a different way than in previous work with an initially linear backbone. In addition, the nonlinearity of the protein backbone decreases the folding speed.
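For orientation, the two ingredients named above can be written in their generic textbook forms. This is only a sketch under stated assumptions: the field names $\theta$ (backbone) and $\phi$ (source), the mass parameter $m$, and the coupling $\lambda$ are illustrative choices, and the backbone-source coupling and symmetry-breaking terms are omitted because the abstract does not specify them.

```latex
% Backbone: sine-Gordon Lagrangian density (dimensionless field \theta, assumed notation)
\mathcal{L}_{\mathrm{SG}}
  = \tfrac{1}{2}\,\partial_\mu\theta\,\partial^\mu\theta - m^{2}\,(1-\cos\theta)
\quad\Longrightarrow\quad
\partial_\mu\partial^\mu\theta + m^{2}\sin\theta = 0 .

% Source: bosonic field with a quartic self-interaction (assumed normalization \lambda/4!)
\mathcal{L}_{\phi}
  = \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \frac{\lambda}{4!}\,\phi^{4} .
```

The equation of motion on the first line follows from the Euler-Lagrange equation applied to $\mathcal{L}_{\mathrm{SG}}$; the actual model may differ in normalization and in additional terms.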
Processes that proceed reliably from a variety of initial conditions to a unique final form, regardless of moderately changing conditions, are of obvious importance in biophysics. Protein folding is a case in point. We show that the action principle can be applied directly to study the stability of biological processes. The action principle in classical physics starts with the first variation of the action and leads immediately to the equations of motion. The second variation of the action leads in a natural way to powerful theorems that provide quantitative treatment of stability and focusing and also explain how some very complex processes can behave as though some seemingly important forces drop out. We first apply these ideas to the non-equilibrium states involved in two-state folding. We treat torsional waves and use the action principle to discuss critical points in the dynamics. For some proteins the theory resembles transition state theory (TST). We reach several quantitative and qualitative conclusions. Besides giving an explanation of why TST often works in folding, we find that the apparent smoothness of the energy funnel is a natural consequence of the putative critical points in the dynamics. These ideas also explain why biological proteins fold to unique states and random polymers do not. The insensitivity to perturbations that follows from the presence of critical points explains how folding to a unique shape occurs in the presence of dilute denaturing agents, in spite of the fact that those agents disrupt the folded structure of the native state. This paper contributes to the theoretical armamentarium by directing attention to the logical progression from first physical principles to the stability theorems related to catastrophe theory as applied to folding. This approach can potentially have the same success in biophysics as it has enjoyed in optics.
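The first and second variations referred to above can be illustrated for a generic system with a single coordinate $q(t)$. This is a standard textbook sketch, not the torsional-wave model of the paper; the symbols $L$, $q$, $\eta$, and $\varepsilon$ are assumed notation, with $\eta$ a perturbation that vanishes at the endpoints.

```latex
S[q] = \int_{t_0}^{t_1} L(q,\dot q,t)\,dt ,
\qquad
S[q+\varepsilon\eta] = S[q] + \varepsilon\,\delta S + \tfrac{1}{2}\varepsilon^{2}\,\delta^{2}S + O(\varepsilon^{3}).

% First variation: vanishing for all \eta gives the equation of motion
\delta S = \int_{t_0}^{t_1}
  \left( \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} \right)\eta \, dt = 0
\quad\Longrightarrow\quad
\frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0 .

% Second variation: a quadratic form in the perturbation
\delta^{2}S = \int_{t_0}^{t_1}
  \left( \frac{\partial^{2}L}{\partial q^{2}}\,\eta^{2}
       + 2\,\frac{\partial^{2}L}{\partial q\,\partial\dot q}\,\eta\,\dot\eta
       + \frac{\partial^{2}L}{\partial \dot q^{2}}\,\dot\eta^{2} \right) dt .
```

Whether $\delta^{2}S$ stays sign-definite along a trajectory is what determines whether nearby trajectories remain close or refocus; points where it first degenerates are the critical (conjugate) points of the dynamics.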
A model is proposed to describe the mechanism of conformational dynamics in proteins based on matter interactions, using a Lagrangian approach and imposing a certain symmetry breaking. Both the conformational changes of proteins and the injected nonlinear sources are represented by a bosonic Lagrangian, with an additional $\phi^4$ interaction for the sources. In the model, the spring tension of the protein, which represents the internal hydrogen bonds, is realized as the interaction between individual amino acids and the nonlinear sources. The folding pathway is determined by the strength of the nonlinear sources that propagate through the protein backbone. It is also shown that the model reproduces the results of some previous works.
Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multi-protein complexes, and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners, causing their sequences to be correlated. Here we exploit these correlations to accurately identify which proteins are specific interaction partners from sequence data alone. Our general approach, which employs a pairwise maximum entropy model to infer couplings between residues, has been successfully used to predict the three-dimensional structures of proteins from sequences. Thus inspired, we introduce an iterative algorithm to predict specific interaction partners from two protein families whose members are known to interact. We first assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. We obtain a striking 0.93 true positive fraction on our complete dataset without any a priori knowledge of interaction partners, and we uncover the origin of this success. We then apply the algorithm to proteins from ATP-binding cassette (ABC) transporter complexes, and obtain accurate predictions in these systems as well. Finally, we present two metrics that accurately distinguish interacting protein families from non-interacting ones, using only sequence data.
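The idea of scoring candidate partners with pairwise couplings can be sketched in a few lines. This is a minimal illustration, not the authors' iterative algorithm: the function names, the dictionary layout of `couplings`, the toy coupling values, and the brute-force matching are all assumptions made for the example. In practice the couplings would be inferred from a sequence alignment rather than written by hand.

```python
from itertools import permutations


def interaction_energy(seq_a, seq_b, couplings):
    """Sum the inter-protein pairwise couplings for one candidate pair.

    `couplings` is a hypothetical dict keyed by (site_in_a, residue_a,
    site_in_b, residue_b); missing keys count as zero. Lower energy is
    taken to mean a more favourable pairing.
    """
    return sum(
        couplings.get((i, a, j, b), 0.0)
        for i, a in enumerate(seq_a)
        for j, b in enumerate(seq_b)
    )


def best_pairing(seqs_a, seqs_b, couplings):
    """Find the one-to-one matching with the lowest total energy.

    Brute force over all permutations, so only suitable for the handful
    of candidates found within a single species in this toy setting.
    """
    if len(seqs_a) != len(seqs_b):
        raise ValueError("both families need the same number of sequences")
    best_energy, best_perm = None, None
    for perm in permutations(range(len(seqs_b))):
        total = sum(
            interaction_energy(seqs_a[k], seqs_b[perm[k]], couplings)
            for k in range(len(seqs_a))
        )
        if best_energy is None or total < best_energy:
            best_energy, best_perm = total, perm
    return list(enumerate(best_perm)), best_energy


# Toy data (assumed): one-residue "proteins" and two favourable couplings.
toy_couplings = {(0, "A", 0, "K"): -1.0, (0, "D", 0, "R"): -1.0}
pairs, energy = best_pairing(["A", "D"], ["R", "K"], toy_couplings)
print(pairs, energy)
```

Each returned pair is `(index_in_family_a, index_in_family_b)`. An iterative scheme would alternate between inferring couplings from the currently paired sequences and re-matching with the updated couplings, which this sketch does not attempt.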