An Upper Bound on the Convergence Rate of a Second Functional in Optimal Sequence Alignment

Added by Ionel Popescu

Publication date 2014

fields

The research language is English.

Authors Raphael Hauser - Heinrich Matzinger - Ionel Popescu

Probability

Abstract

Consider finite sequences $X_{[1,n]}=X_1\dots X_n$ and $Y_{[1,n]}=Y_1\dots Y_n$ of length $n$, consisting of i.i.d. samples of random letters from a finite alphabet, and let $S$ and $T$ be chosen i.i.d. randomly from the unit ball in the space of symmetric scoring functions over this alphabet augmented by a gap symbol. We prove a probabilistic upper bound of linear order in $n^{0.75}$ for the deviation of the score relative to $T$ of optimal alignments with gaps of $X_{[1,n]}$ and $Y_{[1,n]}$ relative to $S$. It remains an open problem to prove a lower bound. Our result contributes to the understanding of the microstructure of optimal alignments relative to one given scoring function, extending a theory begun by the first two authors.
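To make the two-functional setup in the abstract concrete, the following is a minimal sketch, not the authors' construction. It assumes the "unit ball" is the Euclidean unit ball over the unordered symbol pairs of the gap-augmented alphabet, and it uses a standard global-alignment dynamic program (Needleman-Wunsch style) to find one alignment that is optimal relative to $S$, then evaluates the same alignment under $T$. The names `sample_scoring`, `optimal_alignment`, `score`, and the gap symbol `"-"` are hypothetical choices made for illustration.

```python
import math
import random

GAP = "-"  # gap symbol added to the alphabet (illustrative choice)


def sample_scoring(alphabet, rng):
    """Draw a symmetric scoring function uniformly from the Euclidean unit ball.

    Assumption: one coordinate per unordered pair of symbols, gap included.
    """
    symbols = list(alphabet) + [GAP]
    pairs = [(a, b) for i, a in enumerate(symbols) for b in symbols[i:]]
    direction = [rng.gauss(0.0, 1.0) for _ in pairs]
    norm = math.sqrt(sum(v * v for v in direction))
    radius = rng.random() ** (1.0 / len(pairs))  # uniform-in-ball radius
    scoring = {}
    for (a, b), v in zip(pairs, direction):
        scoring[(a, b)] = scoring[(b, a)] = radius * v / norm
    return scoring


def optimal_alignment(x, y, s):
    """Return one alignment of x and y (list of symbol pairs) maximizing the score under s."""
    n, m = len(x), len(y)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + s[(x[i - 1], GAP)]
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + s[(GAP, y[j - 1])]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + s[(x[i - 1], y[j - 1])],  # align two letters
                dp[i - 1][j] + s[(x[i - 1], GAP)],           # letter of x against a gap
                dp[i][j - 1] + s[(GAP, y[j - 1])],           # gap against a letter of y
            )
    # Trace back by re-evaluating the same expressions that produced dp[i][j].
    aligned = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + s[(x[i - 1], y[j - 1])]:
            aligned.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + s[(x[i - 1], GAP)]:
            aligned.append((x[i - 1], GAP))
            i -= 1
        else:
            aligned.append((GAP, y[j - 1]))
            j -= 1
    aligned.reverse()
    return aligned


def score(aligned, t):
    """Score a fixed alignment under a (possibly different) scoring function t."""
    return sum(t[pair] for pair in aligned)


if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = "ACGT"
    n = 200
    x = "".join(rng.choice(alphabet) for _ in range(n))
    y = "".join(rng.choice(alphabet) for _ in range(n))
    S = sample_scoring(alphabet, rng)
    T = sample_scoring(alphabet, rng)
    alignment = optimal_alignment(x, y, S)
    print("score under S:", score(alignment, S))
    print("score under T:", score(alignment, T))
```

Repeating the experiment over many independent draws of the sequences and comparing the spread of the score under `T` across values of `n` gives an empirical view of the deviation that the abstract bounds. The sketch picks a single optimal alignment through its tie-breaking order, so it does not explore the full set of alignments that are optimal under `S`.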

A Homotopic Method to Solve the Lasso Problems with an Improved Upper Bound of Convergence Rate

Yujie Zhao, Xiaoming Huo (2020)

In optimization, it is known that when the objective functions are strictly convex and well-conditioned, gradient-based approaches can be extremely effective, e.g., achieving an exponential rate of convergence. On the other hand, existing Lasso-type estimators in general cannot achieve the optimal rate because of the undesirable behavior of the absolute value function at the origin. A homotopic method uses a sequence of surrogate functions to approximate the $\ell_1$ penalty used in Lasso-type estimators. The surrogate functions converge to the $\ell_1$ penalty in the Lasso estimator. At the same time, each surrogate function is strictly convex, which enables a provably faster numerical rate of convergence. In this paper, we demonstrate that by meticulously defining the surrogate functions, one can prove a faster numerical convergence rate than that of any existing method for computing Lasso-type estimators. Namely, the state-of-the-art algorithms can only guarantee $O(1/\epsilon)$ or $O(1/\sqrt{\epsilon})$ convergence rates, while we can prove an $O([\log(1/\epsilon)]^2)$ rate for the newly proposed algorithm. Our numerical simulations show that the new algorithm also performs better empirically.

Machine Learning, Computation

Central Limit Theorem in High Dimensions : The Optimal Bound on Dimension Growth Rate

Debraj Das, Soumendra Lahiri (2020)

In this article, we try to give an answer to the simple question: ``\textit{What is the critical growth rate of the dimension $p$ as a function of the sample size $n$ for which the Central Limit Theorem holds uniformly over the collection of $p$-dimensional hyper-rectangles?}'' Specifically, we are interested in the normal approximation of suitably scal

Probability

A temporal perspective on the rate of convergence in first-passage percolation under a moment condition

Daniel Ahlberg (2014)

We study the rate of convergence in the Shape Theorem of first-passage percolation, obtaining the precise asymptotic rate of decay for the probability of linear order deviations under a moment condition. Our results are stated for a given time and complement recent work by the same author, in which the rate of convergence was studied from the standard spatial perspective.

Probability

An upper bound on Jacobi polynomials

Ilia Krasikov (2006)

Let ${\bf P}_k^{(\alpha, \beta)}(x)$ be an orthonormal Jacobi polynomial of degree $k$. We will establish the following inequality \begin{equation*} \max_{x \in [\delta_{-1},\delta_1]}\sqrt{(x- \delta_{-1})(\delta_1-x)} \, (1-x)^{\alpha}(1+x)^{\beta} ({\bf P}_{k}^{(\alpha, \beta)}(x))^2 < \frac{3 \sqrt{5}}{5}, \end{equation*} where $\delta_{-1}<\delta_1$ are appropriate approximations to the extreme zeros of ${\bf P}_k^{(\alpha, \beta)}(x)$. As a corollary we confirm, even in a stronger form, T. Erdélyi, A.P. Magnus and P. Nevai conjecture [Erdélyi et al., Generalized Jacobi weights, Christoffel functions, and Jacobi polynomials, SIAM J. Math. Anal. 25 (1994), 602-614], by proving that \begin{equation*} \max_{x \in [-1,1]}(1-x)^{\alpha+{1/2}}(1+x)^{\beta+{1/2}}({\bf P}_k^{(\alpha, \beta)}(x))^2 < 3 \alpha^{1/3} \left(1+ \frac{\alpha}{k}\right)^{1/6}, \end{equation*} in the region $k \ge 6$, $\alpha, \beta \ge \frac{1+ \sqrt{2}}{4}$.

Classical Analysis and ODEs

Sketching and Sequence Alignment: A Rate-Distortion Perspective

Ilan Shomorony, Govinda M. Kamath (2021)

Pairwise alignment of DNA sequencing data is a ubiquitous task in bioinformatics and typically represents a heavy computational burden. A standard approach to speed up this task is to compute sketches of the DNA reads (typically via hashing-based techniques) that allow the efficient computation of pairwise alignment scores. We propose a rate-distortion framework to study the problem of computing sketches that achieve the optimal tradeoff between sketch size and alignment estimation distortion. We consider the simple setting of i.i.d. error-free sources of length $n$ and introduce a new sketching algorithm called locational hashing. While standard approaches in the literature based on min-hashes require $B = (1/D) \cdot O\left( \log n \right)$ bits to achieve a distortion $D$, our proposed approach only requires $B = \log^2(1/D) \cdot O(1)$ bits. This can lead to significant computational savings in pairwise alignment estimation.
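For readers unfamiliar with the min-hash baseline mentioned in this abstract, here is a minimal sketch of a generic min-hash estimator. It is not the locational hashing algorithm proposed by the authors, whose details are not given here. It assumes reads are compared through the Jaccard similarity of their $k$-mer sets, and the function names, the choice $k=4$, and the use of seeded SHA-256 as a hash family are all hypothetical choices made for illustration.

```python
import hashlib


def kmers(read, k):
    """Set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def seeded_hash(seed, kmer):
    """Deterministic 64-bit hash of a k-mer, one hash function per seed."""
    digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(read, k=4, num_hashes=64):
    """Keep, for each hash function, the minimum hash value over the read's k-mers."""
    ks = kmers(read, k)
    return [min(seeded_hash(seed, km) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of hash functions whose minima agree; estimates k-mer Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


if __name__ == "__main__":
    read_a = "ACGTACGTTGCAACGTTAGC"
    read_b = "CGTACGTTGCAACGTTAGCA"  # read_a shifted by one position
    print(estimate_jaccard(minhash_sketch(read_a), minhash_sketch(read_b)))
```

The sketch size grows with `num_hashes`, which is the knob that trades bits for estimation accuracy in this kind of baseline.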

Information Theory