No Arabic abstract
The tertiary structures of functional RNA molecules remain difficult to decipher. A new generation of automated RNA structure prediction methods may help address these challenges but have not yet been experimentally validated. Here we apply four prediction tools to a remarkable class of double glycine riboswitches that exhibit ligand-binding cooperativity. A novel method (BPPalign), RMdetect, JAR3D, and Rosetta 3D modeling give consistent predictions for a new stem P0 and kink-turn motif. These elements structure the linker between the RNAs double aptamers. Chemical mapping on the F. nucleatum riboswitch with SHAPE, DMS, and CMCT probing, mutate-and-map studies, and mutation/rescue experiments all provide strong evidence for the structured linker. Under solution conditions that separate two glycine binding transitions, disrupting this helix-junction-helix structure gives 120-fold and 6- to 30-fold poorer association constants for the two transitions, corresponding to an overall energetic impact of 4.3 pm 0.5 kcal/mol. Prior biochemical and crystallography studies from several labs did not include this critical element due to over-truncation of the RNA. We argue that several further undiscovered elements are likely to exist in the flanking regions of this and other RNA switches, and automated prediction tools can now play a powerful role in their detection and dissection.
RNA function crucially depends on its structure. Thermodynamic models currently used for secondary structure prediction rely on computing the partition function of folding ensembles, and can thus estimate minimum free-energy structures and ensemble populations. These models sometimes fail in identifying native structures unless complemented by auxiliary experimental data. Here, we build a set of models that combine thermodynamic parameters, chemical probing data (DMS, SHAPE), and co-evolutionary data (Direct Coupling Analysis, DCA) through a network that outputs perturbations to the ensemble free energy. Perturbations are trained to increase the ensemble populations of a representative set of known native RNA structures. In the chemical probing nodes of the network, a convolutional window combines neighboring reactivities, enlightening their structural information content and the contribution of local conformational ensembles. Regularization is used to limit overfitting and improve transferability. The most transferable model is selected through a cross-validation strategy that estimates the performance of models on systems on which they are not trained. With the selected model we obtain increased ensemble populations for native structures and more accurate predictions in an independent validation set. The flexibility of the approach allows the model to be easily retrained and adapted to incorporate arbitrary experimental information.
RNA is a fundamental class of biomolecules that mediate a large variety of molecular processes within the cell. Computational algorithms can be of great help in the understanding of RNA structure-function relationship. One of the main challenges in this field is the development of structure-prediction algorithms, which aim at the prediction of the three-dimensional (3D) native fold from the sole knowledge of the sequence. In a recent paper, we have introduced a scoring function for RNA structure prediction. Here, we analyze in detail the performance of the method, we underline strengths and shortcomings, and we discuss the results with respect to state-of-the-art techniques. These observations provide a starting point for improving current methodologies, thus paving the way to the advances of more accurate approaches for RNA 3D structure prediction.
For decades, dimethyl sulfate (DMS) mapping has informed manual modeling of RNA structure in vitro and in vivo. Here, we incorporate DMS data into automated secondary structure inference using a pseudo-energy framework developed for 2-OH acylation (SHAPE) mapping. On six non-coding RNAs with crystallographic models, DMS- guided modeling achieves overall false negative and false discovery rates of 9.5% and 11.6%, comparable or better than SHAPE-guided modeling; and non-parametric bootstrapping provides straightforward confidence estimates. Integrating DMS/SHAPE data and including CMCT reactivities give small additional improvements. These results establish DMS mapping - an already routine technique - as a quantitative tool for unbiased RNA structure modeling.
Three-dimensional RNA models fitted into crystallographic density maps exhibit pervasive conformational ambiguities, geometric errors and steric clashes. To address these problems, we present enumerative real-space refinement assisted by electron density under Rosetta (ERRASER), coupled to Python-based hierarchical environment for integrated xtallography (PHENIX) diffraction-based refinement. On 24 data sets, ERRASER automatically corrects the majority of MolProbity-assessed errors, improves the average Rfree factor, resolves functionally important discrepancies in noncanonical structure and refines low-resolution models to better match higher-resolution models.
Chemical purity of RNA samples is critical for high-precision studies of RNA folding and catalytic behavior, but such purity may be compromised by photodamage accrued during ultraviolet (UV) visualization of gel-purified samples. Here, we quantitatively assess the breadth and extent of such damage by using reverse transcription followed by single-nucleotide-resolution capillary electrophoresis. We detected UV-induced lesions across a dozen natural and artificial RNAs including riboswitch domains, other non-coding RNAs, and artificial sequences; across multiple sequence contexts, dominantly at but not limited to pyrimidine doublets; and from multiple lamps that are recommended for UV shadowing in the literature. Most strikingly, irradiation time-courses reveal detectable damage within a few seconds of exposure, and these data can be quantitatively fit to a skin effect model that accounts for the increased exposure of molecules near the top of irradiated gel slices. The results indicate that 200-nucleotide RNAs subjected to 20 seconds or less of UV shadowing can incur damage to 20% of molecules, and the molecule-by-molecule distribution of these lesions is more heterogeneous than a Poisson distribution. Photodamage from UV shadowing is thus likely a widespread but unappreciated cause of artifactual heterogeneity in quantitative and single-molecule-resolution RNA biophysical measurements.