Do you want to publish a course? Click here

A new distance for high level RNA secondary structure comparison

96   0   0.0 ( 0 )
 Added by Laetitia Omer
 Publication date 2008
and research's language is English
 Authors Julien Allali




Ask ChatGPT about the research

We describe an algorithm for comparing two RNA secondary structures coded in the form of trees that introduces two new operations, called node fusion and edge fusion, besides the tree edit operations of deletion, insertion, and relabeling classically used in the literature. This allows us to address some serious limitations of the more traditional tree edit operations when the trees represent RNAs and what is searched for is a common structural core of two RNAs. Although the algorithm complexity has an exponential term, this term depends only on the number of successive fusions that may be applied to a same node, not on the total number of fusions. The algorithm remains therefore efficient in practice and is used for illustrative purposes on ribosomal as well as on other types of RNAs.



rate research

Read More

An RNA sequence is a word over an alphabet on four elements ${A,C,G,U}$ called bases. RNA sequences fold into secondary structures where some bases match one another while others remain unpaired. Pseudoknot-free secondary structures can be represented as well-parenthesized expressions with additional dots, where pairs of matching parentheses symbolize paired bases and dots, unpaired bases. The two fundamental problems in RNA algorithmic are to predict how sequences fold within some model of energy and to design sequences of bases which will fold into targeted secondary structures. Predicting how a given RNA sequence folds into a pseudoknot-free secondary structure is known to be solvable in cubic time since the eighties and in truly subcubic time by a recent result of Bringmann et al. (FOCS 2016). As a stark contrast, it is unknown whether or not designing a given RNA secondary structure is a tractable task; this has been raised as a challenging open question by Anne Condon (ICALP 2003). Because of its crucial importance in a number of fields such as pharmaceutical research and biochemistry, there are dozens of heuristics and software libraries dedicated to RNA secondary structure design. It is therefore rather surprising that the computational complexity of this central problem in bioinformatics has been unsettled for decades. In this paper we show that, in the simplest model of energy which is the Watson-Crick model the design of secondary structures is NP-complete if one adds natural constraints of the form: index $i$ of the sequence has to be labeled by base $b$. This negative result suggests that the same lower bound holds for more realistic models of energy. It is noteworthy that the additional constraints are by no means artificial: they are provided by all the RNA design pieces of software and they do correspond to the actual practice.
The tertiary structures of functional RNA molecules remain difficult to decipher. A new generation of automated RNA structure prediction methods may help address these challenges but have not yet been experimentally validated. Here we apply four prediction tools to a remarkable class of double glycine riboswitches that exhibit ligand-binding cooperativity. A novel method (BPPalign), RMdetect, JAR3D, and Rosetta 3D modeling give consistent predictions for a new stem P0 and kink-turn motif. These elements structure the linker between the RNAs double aptamers. Chemical mapping on the F. nucleatum riboswitch with SHAPE, DMS, and CMCT probing, mutate-and-map studies, and mutation/rescue experiments all provide strong evidence for the structured linker. Under solution conditions that separate two glycine binding transitions, disrupting this helix-junction-helix structure gives 120-fold and 6- to 30-fold poorer association constants for the two transitions, corresponding to an overall energetic impact of 4.3 pm 0.5 kcal/mol. Prior biochemical and crystallography studies from several labs did not include this critical element due to over-truncation of the RNA. We argue that several further undiscovered elements are likely to exist in the flanking regions of this and other RNA switches, and automated prediction tools can now play a powerful role in their detection and dissection.
For decades, dimethyl sulfate (DMS) mapping has informed manual modeling of RNA structure in vitro and in vivo. Here, we incorporate DMS data into automated secondary structure inference using a pseudo-energy framework developed for 2-OH acylation (SHAPE) mapping. On six non-coding RNAs with crystallographic models, DMS- guided modeling achieves overall false negative and false discovery rates of 9.5% and 11.6%, comparable or better than SHAPE-guided modeling; and non-parametric bootstrapping provides straightforward confidence estimates. Integrating DMS/SHAPE data and including CMCT reactivities give small additional improvements. These results establish DMS mapping - an already routine technique - as a quantitative tool for unbiased RNA structure modeling.
113 - Qi Zhao , Zheng Zhao , Xiaoya Fan 2020
Secondary structure plays an important role in determining the function of non-coding RNAs. Hence, identifying RNA secondary structures is of great value to research. Computational prediction is a mainstream approach for predicting RNA secondary structure. Unfortunately, even though new methods have been proposed over the past 40 years, the performance of computational prediction methods has stagnated in the last decade. Recently, with the increasing availability of RNA structure data, new methods based on machine-learning technologies, especially deep learning, have alleviated the issue. In this review, we provide a comprehensive overview of RNA secondary structure prediction methods based on machine-learning technologies and a tabularized summary of the most important methods in this field. The current pending issues in the field of RNA secondary structure prediction and future trends are also discussed.
Given a random RNA secondary structure, $S$, we study RNA sequences having fixed ratios of nuclotides that are compatible with $S$. We perform this analysis for RNA secondary structures subject to various base pairing rules and minimum arc- and stack-length restrictions. Our main result reads as follows: in the simplex of the nucleotide ratios there exists a convex region in which, in the limit of long sequences, a random structure a.a.s.~has compatible sequence with these ratios and outside of which a.a.s.~a random structure has no such compatible sequence. We localize this region for RNA secondary structures subject to various base pairing rules and minimum arc- and stack-length restrictions. In particular, for {bf GC}-sequences having a ratio of {bf G} nucleotides smaller than $1/3$, a random RNA secondary structure without any minimum arc- and stack-length restrictions has a.a.s.~no such compatible sequence. For sequences having a ratio of {bf G} nucleotides larger than $1/3$, a random RNA secondary structure has a.a.s. such compatible sequences. We discuss our results in the context of various families of RNA structures.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا