ترغب بنشر مسار تعليمي؟ اضغط هنا

An Upper Bound on the Convergence Rate of a Second Functional in Optimal Sequence Alignment

126   0   0.0 ( 0 )
 نشر من قبل Ionel Popescu
 تاريخ النشر 2014
  مجال البحث
والبحث باللغة English




اسأل ChatGPT حول البحث

Consider finite sequences $X_{[1,n]}=X_1dots X_n$ and $Y_{[1,n]}=Y_1dots Y_n$ of length $n$, consisting of i.i.d. samples of random letters from a finite alphabet, and let $S$ and $T$ be chosen i.i.d. randomly from the unit ball in the space of symmetric scoring functions over this alphabet augmented by a gap symbol. We prove a probabilistic upper bound of linear order in $n^{0.75}$ for the deviation of the score relative to $T$ of optimal alignments with gaps of $X_{[1,n]}$ and $Y_{[1,n]}$ relative to $S$. It remains an open problem to prove a lower bound. Our result contributes to the understanding of the microstructure of optimal alignments relative to one given scoring function, extending a theory begun by the first two authors.



قيم البحث

اقرأ أيضاً

128 - Yujie Zhao , Xiaoming Huo 2020
In optimization, it is known that when the objective functions are strictly convex and well-conditioned, gradient based approaches can be extremely effective, e.g., achieving the exponential rate in convergence. On the other hand, the existing Lasso- type of estimator in general cannot achieve the optimal rate due to the undesirable behavior of the absolute function at the origin. A homotopic method is to use a sequence of surrogate functions to approximate the $ell_1$ penalty that is used in the Lasso-type of estimators. The surrogate functions will converge to the $ell_1$ penalty in the Lasso estimator. At the same time, each surrogate function is strictly convex, which enables provable faster numerical rate of convergence. In this paper, we demonstrate that by meticulously defining the surrogate functions, one can prove faster numerical convergence rate than any existing methods in computing for the Lasso-type of estimators. Namely, the state-of-the-art algorithms can only guarantee $O(1/epsilon)$ or $O(1/sqrt{epsilon})$ convergence rates, while we can prove an $O([log(1/epsilon)]^2)$ for the newly proposed algorithm. Our numerical simulations show that the new algorithm also performs better empirically.
In this article, we try to give an answer to the simple question: ``textit{What is the critical growth rate of the dimension $p$ as a function of the sample size $n$ for which the Central Limit Theorem holds uniformly over the collection of $p$-dimen sional hyper-rectangles ?}. Specifically, we are interested in the normal approximation of suitably scal
128 - Daniel Ahlberg 2014
We study the rate of convergence in the Shape Theorem of first-passage percolation, obtaining the precise asymptotic rate of decay for the probability of linear order deviations under a moment condition. Our results are stated for a given time and co mplements recent work by the same author, in which the rate of convergence was studied from the standard spatial perspective.
108 - Ilia Krasikov 2006
Let ${bf P}_k^{(alpha, beta)} (x)$ be an orthonormal Jacobi polynomial of degree $k.$ We will establish the following inequality begin{equation*} max_{x in [delta_{-1},delta_1]}sqrt{(x- delta_{-1})(delta_1-x)} (1-x)^{alpha}(1+x)^{beta} ({bf P}_{k}^{( alpha, beta)} (x))^2 < frac{3 sqrt{5}}{5}, end{equation*} where $delta_{-1}<delta_1$ are appropriate approximations to the extreme zeros of ${bf P}_k^{(alpha, beta)} (x) .$ As a corollary we confirm, even in a stronger form, T. Erd{e}lyi, A.P. Magnus and P. Nevai conjecture [Erd{e}lyi et al., Generalized Jacobi weights, Christoffel functions, and Jacobi polynomials, SIAM J. Math. Anal. 25 (1994), 602-614], by proving that begin{equation*} max_{x in [-1,1]}(1-x)^{alpha+{1/2}}(1+x)^{beta+{1/2}}({bf P}_k^{(alpha, beta)} (x))^2 < 3 alpha^{1/3} (1+ frac{alpha}{k})^{1/6}, end{equation*} in the region $k ge 6, alpha, beta ge frac{1+ sqrt{2}}{4}.$
Pairwise alignment of DNA sequencing data is a ubiquitous task in bioinformatics and typically represents a heavy computational burden. A standard approach to speed up this task is to compute sketches of the DNA reads (typically via hashing-based tec hniques) that allow the efficient computation of pairwise alignment scores. We propose a rate-distortion framework to study the problem of computing sketches that achieve the optimal tradeoff between sketch size and alignment estimation distortion. We consider the simple setting of i.i.d. error-free sources of length $n$ and introduce a new sketching algorithm called locational hashing. While standard approaches in the literature based on min-hashes require $B = (1/D) cdot Oleft( log n right)$ bits to achieve a distortion $D$, our proposed approach only requires $B = log^2(1/D) cdot O(1)$ bits. This can lead to significant computational savings in pairwise alignment estimation.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا