
The block spectrum of RNA pseudoknot structures

Published by Thomas Li
Publication date: 2018
Research language: English





In this paper we analyze the length-spectrum of blocks in $\gamma$-structures. $\gamma$-structures are a class of RNA pseudoknot structures that plays a key role in the context of polynomial time RNA folding. A $\gamma$-structure is constructed by nesting and concatenating specific building components having topological genus at most $\gamma$. A block is a substructure enclosed by crossing maximal arcs with respect to the partial order induced by nesting. We show that, in uniformly generated $\gamma$-structures, there is a significant gap in this length-spectrum, i.e., there asymptotically almost surely exists a unique longest block of length at least $n-O(n^{1/2})$ and, with high probability, any other block has finite length. For fixed $\gamma$, we prove that the length of the longest block converges to a discrete limit law, and that the distribution of short blocks of given length tends to a negative binomial distribution in the limit of long sequences. We refine this analysis to the length spectrum of blocks of specific pseudoknot types, such as H-type and kissing hairpins. Our results generalize the rainbow spectrum on secondary structures by the first and third authors and are put into context with the structural prediction of long non-coding RNAs.


Read also

In this paper we analyze the length-spectrum of rainbows in RNA secondary structures. A rainbow in a secondary structure is a maximal arc with respect to the partial order induced by nesting. We show that there is a significant gap in this length-spectrum. We prove that there asymptotically almost surely exists a unique longest rainbow of length at least $n-O(n^{1/2})$ and that with high probability any other rainbow has finite length. We show that the distribution of the length of the longest rainbow converges to a discrete limit law and that, for finite $k$, the distribution of rainbows of length $k$ becomes, for large $n$, a negative binomial distribution. We then put the results of this paper into context, comparing the analytical results with those observed in RNA minimum free energy structures and biological RNA structures, and relate our findings to the sparsification of folding algorithms.
In this paper we study $k$-noncrossing, canonical RNA pseudoknot structures with minimum arc-length $\ge 4$. Let ${\sf T}_{k,\sigma}^{[4]}(n)$ denote the number of these structures. We derive exact enumeration results by computing the generating function ${\bf T}_{k,\sigma}^{[4]}(z)=\sum_n {\sf T}_{k,\sigma}^{[4]}(n)z^n$ and derive the asymptotic formulas ${\sf T}_{k,3}^{[4]}(n)\sim c_k n^{-(k-1)^2-\frac{k-1}{2}} (\gamma_{k,3}^{[4]})^{-n}$ for $k=3,\dots,9$. In particular we have for $k=3$, ${\sf T}_{3,3}^{[4]}(n)\sim c_3 n^{-5}\, 2.0348^n$. Our results prove that the set of biophysically relevant RNA pseudoknot structures is surprisingly small and suggest a new structure class as target for prediction algorithms.
In this paper we study the distribution of stacks in $k$-noncrossing, $\tau$-canonical RNA pseudoknot structures ($\langle k,\tau\rangle$-structures). An RNA structure is called $k$-noncrossing if it has no more than $k-1$ mutually crossing arcs and $\tau$-canonical if each arc is contained in a stack of length at least $\tau$. Based on the ordinary generating function of $\langle k,\tau\rangle$-structures \cite{Reidys:08ma} we derive the bivariate generating function ${\bf T}_{k,\tau}(x,u)=\sum_{n \geq 0} \sum_{0\leq t \leq \frac{n}{2}} {\sf T}_{k,\tau}(n,t)\, u^t x^n$, where ${\sf T}_{k,\tau}(n,t)$ is the number of $\langle k,\tau\rangle$-structures having exactly $t$ stacks, and study its singularities. We show that for a certain parametrization of the variable $u$, ${\bf T}_{k,\tau}(x,u)$ has a unique, dominant singularity. The particular shift of this singularity parametrized by $u$ implies a central limit theorem for the distribution of stack-numbers. Our results are of importance for understanding the ``language'' of minimum-free energy RNA pseudoknot structures, generated by computer folding algorithms.
In this paper we present a self-contained analysis and description of the novel {\it ab initio} folding algorithm {\sf cross}, which generates the minimum free energy (mfe), 3-noncrossing, $\sigma$-canonical RNA structure. Here an RNA structure is 3-noncrossing if it does not contain more than three mutually crossing arcs and $\sigma$-canonical if each of its stacks has size greater than or equal to $\sigma$. Our notion of mfe-structure is based on a specific concept of pseudoknots and respective loop-based energy parameters. The algorithm decomposes into three parts: the first is the inductive construction of motifs and shadows, the second is the generation of the skeleta-trees rooted in irreducible shadows and the third is the saturation of skeleta via context-dependent dynamic programming routines.
In this paper we study $k$-noncrossing RNA structures with minimum arc-length 4 and at most $k-1$ mutually crossing bonds. Let ${\sf T}_{k}^{[4]}(n)$ denote the number of $k$-noncrossing RNA structures with arc-length $\ge 4$ over $n$ vertices. We (a) prove a functional equation for the generating function $\sum_{n\ge 0}{\sf T}_{k}^{[4]}(n)z^n$ and (b) derive for $k\le 9$ the asymptotic formula ${\sf T}_{k}^{[4]}(n)\sim c_k n^{-((k-1)^2+(k-1)/2)} \gamma_k^{-n}$. Furthermore we explicitly compute the exponential growth rates $\gamma_k^{-1}$ and asymptotic formulas for $4\le k\le 9$.
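
The rainbow definition used in the first related abstract above (a maximal arc with respect to nesting) can be illustrated with a minimal Python sketch. It is not taken from any of the listed papers. It assumes the secondary structure is given in dot-bracket notation, uses 0-based positions, and takes the length of an arc $(i,j)$ to be $j-i$; the function names `rainbows` and `rainbow_lengths` are hypothetical.

```python
def rainbows(dot_bracket):
    """Return the maximal arcs (i, j) of a dot-bracket secondary structure.

    Positions are 0-based. An arc is maximal (a "rainbow") when no other
    arc encloses it, i.e. when it closes while no other arc is still open.
    """
    open_positions = []  # stack of positions of currently open arcs
    maximal = []
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            open_positions.append(pos)
        elif ch == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: unmatched ')'")
            start = open_positions.pop()
            if not open_positions:
                # Nothing encloses (start, pos), so it is a rainbow.
                maximal.append((start, pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if open_positions:
        raise ValueError("unbalanced structure: unmatched '('")
    return maximal


def rainbow_lengths(dot_bracket):
    """Rainbow lengths j - i (an assumed convention), longest first."""
    return sorted((j - i for i, j in rainbows(dot_bracket)), reverse=True)


if __name__ == "__main__":
    structure = "((..))..(...)."
    print(rainbows(structure))         # maximal arcs as (i, j) pairs
    print(rainbow_lengths(structure))  # the rainbow length spectrum
```

Applied to many uniformly sampled structures, such a length list is the kind of empirical spectrum against which the abstract's claim of a single long rainbow and several short ones could be compared.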