أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Francesca Cuturello

Machine learning a model for RNA structure prediction

89 - Nicola Calonaci , Alisha Jones , Francesca Cuturello 2020

RNA function crucially depends on its structure. Thermodynamic models currently used for secondary structure prediction rely on computing the partition function of folding ensembles, and can thus estimate minimum free-energy structures and ensemble p opulations. These models sometimes fail in identifying native structures unless complemented by auxiliary experimental data. Here, we build a set of models that combine thermodynamic parameters, chemical probing data (DMS, SHAPE), and co-evolutionary data (Direct Coupling Analysis, DCA) through a network that outputs perturbations to the ensemble free energy. Perturbations are trained to increase the ensemble populations of a representative set of known native RNA structures. In the chemical probing nodes of the network, a convolutional window combines neighboring reactivities, enlightening their structural information content and the contribution of local conformational ensembles. Regularization is used to limit overfitting and improve transferability. The most transferable model is selected through a cross-validation strategy that estimates the performance of models on systems on which they are not trained. With the selected model we obtain increased ensemble populations for native structures and more accurate predictions in an independent validation set. The flexibility of the approach allows the model to be easily retrained and adapted to incorporate arbitrary experimental information.

الجزيئات الحيوية الفيزياء البيولوجية تحليل البيانات والإحصاءات والاحتمال

Assessing the accuracy of direct-coupling analysis for RNA contact prediction

217 - Francesca Cuturello , Guido Tiana , Giovanni Bussi 2018

Many non-coding RNAs are known to play a role in the cell directly linked to their structure. Structure prediction based on the sole sequence is however a challenging task. On the other hand, thanks to the low cost of sequencing technologies, a very large number of homologous sequences are becoming available for many RNA families. In the protein community, it has emerged in the last decade the idea of exploiting the covariance of mutations within a family to predict the protein structure using the direct-coupling-analysis (DCA) method. The application of DCA to RNA systems has been limited so far. We here perform an assessment of the DCA method on 17 riboswitch families, comparing it with the commonly used mutual information analysis and with state-of-the-art R-scape covariance method. We also compare different flavors of DCA, including mean-field, pseudo-likelihood, and a proposed stochastic procedure (Boltzmann learning) for solving exactly the DCA inverse problem. Boltzmann learning outperforms the other methods in predicting contacts observed in high resolution crystal structures.

الأساليب الكمية الميكانيكا الإحصائية الفيزياء البيولوجية

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد