Constant-factor approximation of near-linear edit distance in near-linear time


Posted by Joshua Brakensiek

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Joshua Brakensiek, Aviad Rubinstein

Data Structures and Algorithms


We show that the edit distance between two strings of length $n$ can be computed within a factor of $f(\epsilon)$ in $n^{1+\epsilon}$ time as long as the edit distance is at least $n^{1-\delta}$ for some $\delta(\epsilon) > 0$.


Diptarka Chakraborty, Debarati Das, Elazar Goldenberg, 2018

Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a dynamic programming algorithm that runs in quadratic time. Andoni, Krauthgamer, and Onak (2010) gave a nearly linear time algorithm that approximates edit distance within an approximation factor $\text{poly}(\log n)$. In this paper, we provide an algorithm with running time $\tilde{O}(n^{2-2/7})$ that approximates the edit distance within a constant factor.
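For reference, the exact quadratic-time dynamic program mentioned in this abstract can be sketched as follows. This is the textbook recurrence, not the approximation algorithm of either paper; the function name `edit_distance` is chosen here for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Exact edit distance via the textbook O(len(a) * len(b)) dynamic program."""
    # prev[j] is the edit distance between the prefix of `a` processed so far
    # and b[:j]; only two rows of the table are kept, so memory is O(len(b)).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Every pair of prefixes is visited once, which is where the quadratic running time comes from; the results listed on this page trade exactness for running times below that bound.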

Data Structures and Algorithms

$\ell_2$-norm Flow Diffusion in Near-Linear Time

Li Chen, Richard Peng, 2021

Diffusion is a fundamental graph procedure and has been a basic building block in a wide range of theoretical and empirical applications such as graph partitioning and semi-supervised learning on graphs. In this paper, we study computationally efficient diffusion primitives beyond random walk. We design an $\widetilde{O}(m)$-time randomized algorithm for the $\ell_2$-norm flow diffusion problem, a recently proposed diffusion model based on network flow with demonstrated graph clustering related applications both in theory and in practice. Examples include finding locally-biased low conductance cuts. Using a known connection between the optimal dual solution of the flow diffusion problem and the local cut structure, our algorithm gives an alternative approach for finding such cuts in nearly linear time. From a technical point of view, our algorithm contributes a novel way of dealing with inequality constraints in graph optimization problems. It adapts the high-level algorithmic framework of nearly linear time Laplacian system solvers, but requires several new tools: vertex elimination under constraints, a new family of graph ultra-sparsifiers, and accelerated proximal gradient methods with inexact proximal mapping computation.

Data Structures and Algorithms, Machine Learning

Scheduling with Communication Delay in Near-Linear Time

Quanquan C. Liu, Manish Purohit, Zoya Svitkina, 2021

We consider the problem of efficiently scheduling jobs with precedence constraints on a set of identical machines in the presence of a uniform communication delay. In this setting, if two precedence-constrained jobs $u$ and $v$, with $u \prec v$, are scheduled on different machines, then $v$ must start at least $\rho$ time units after $u$ completes. The scheduling objective is to minimize makespan, i.e. the total time between when the first job starts and the last job completes. The focus of this paper is to provide an efficient approximation algorithm with near-linear running time. We build on the algorithm of Lepere and Rapine [STACS 2002] for this problem to give an $O\left(\frac{\ln \rho}{\ln \ln \rho}\right)$-approximation algorithm that runs in $\tilde{O}(|V| + |E|)$ time.
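To make the constraint in this abstract concrete, here is a minimal sketch of a checker for a given schedule. It only validates a schedule and reports its makespan; it is not the approximation algorithm of the paper. The function name, the dictionary-based input format, and the assumption that a machine runs one job at a time are all choices made for this illustration.

```python
def makespan_if_feasible(start, machine, duration, edges, rho):
    """Return the makespan of a schedule, or None if a constraint is violated.

    start, machine, duration: dicts keyed by job (hypothetical input format).
    edges: iterable of (u, v) pairs meaning u must precede v.
    rho: uniform communication delay between different machines.
    """
    finish = {job: start[job] + duration[job] for job in start}

    # Precedence: v waits for u, plus rho if they run on different machines.
    for u, v in edges:
        delay = rho if machine[u] != machine[v] else 0
        if start[v] < finish[u] + delay:
            return None

    # Assumption for this sketch: each machine runs at most one job at a time.
    by_machine = {}
    for job in start:
        by_machine.setdefault(machine[job], []).append((start[job], finish[job]))
    for intervals in by_machine.values():
        intervals.sort()
        for (_, f1), (s2, _) in zip(intervals, intervals[1:]):
            if s2 < f1:
                return None

    # Makespan: time from the first job's start to the last job's completion.
    return max(finish.values()) - min(start.values())


if __name__ == "__main__":
    duration = {"a": 1, "b": 1, "c": 1}
    machine = {"a": 0, "b": 0, "c": 1}
    edges = [("a", "b"), ("a", "c")]
    print(makespan_if_feasible({"a": 0, "b": 1, "c": 3}, machine, duration, edges, rho=2))
```

In the usage example, job `c` runs on a different machine from its predecessor `a`, so it cannot start before time 3 when $\rho = 2$, while `b` shares a machine with `a` and can start as soon as `a` completes.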

Data Structures and Algorithms

Clustering under Perturbation Stability in Near-Linear Time

Pankaj K. Agarwal, Hsien-Chih Chang, Kamesh Munagala, 2020

We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is $\alpha$-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most $\alpha$. Our main contribution is in presenting efficient exact algorithms for $\alpha$-stable clustering instances whose running times depend near-linearly on the size of the data set when $\alpha \geq 2 + \sqrt{3}$. For $k$-center and $k$-means problems, our algorithms also achieve polynomial dependence on the number of clusters, $k$, when $\alpha \geq 2 + \sqrt{3} + \epsilon$ for any constant $\epsilon > 0$ in any fixed dimension. For $k$-median, our algorithms have polynomial dependence on $k$ for $\alpha > 5$ in any fixed dimension; and for $\alpha \geq 2 + \sqrt{3}$ in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.

Data Structures and Algorithms, Computational Geometry

Near Linear Time Approximation Schemes for Uncapacitated and Capacitated b--Matching Problems in Nonbipartite Graphs

Kook Jin Ahn, Sudipto Guha, 2013

We present the first near optimal approximation schemes for the maximum weighted (uncapacitated or capacitated) $b$--matching problems for non-bipartite graphs that run in time (near) linear in the number of edges. For any $\delta > 3/\sqrt{n}$ the algorithm produces a $(1-\delta)$ approximation in $O(m \, \text{poly}(\delta^{-1}, \log n))$ time. We provide fractional solutions for the standard linear programming formulations for these problems and subsequently also provide (near) linear time approximation schemes for rounding the fractional solutions. Through these problems as a vehicle, we also present several ideas in the context of solving linear programs approximately using fast primal-dual algorithms. First, even though the duals of these problems have exponentially many variables and an efficient exact computation of dual weights is infeasible, we show that we can efficiently compute and use a sparse approximation of the dual weights using a combination of (i) adding perturbation to the constraints of the polytope and (ii) amplification followed by thresholding of the dual weights. Second, we show that approximation algorithms can be used to reduce the width of the formulation and thereby obtain faster convergence.

Data Structures and Algorithms
