Sampling Random Spanning Trees Faster than Matrix Multiplication

87 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Rasmus J Kyng

تاريخ النشر 2016

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف David Durfee - Rasmus Kyng - John Peebles

بنى وهياكل البيانات والخوارزميات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in $tilde{O}(n^{4/3}m^{1/2}+n^{2})$ time (The $tilde{O}(cdot)$ notation hides $operatorname{polylog}(n)$ factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, $O(n^omega)$. For the special case of unweighted graphs, this improves upon the best previously known running time of $tilde{O}(min{n^{omega},msqrt{n},m^{4/3}})$ for $m gg n^{5/3}$ (Colbourn et al. 96, Kelner-Madry 09, Madry et al. 15). The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew determinant-based and random walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination, and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute $epsilon$-approximate effective resistances for a set $S$ of vertex pairs via approximate Schur complements in $tilde{O}(m+(n + |S|)epsilon^{-2})$ time, without using the Johnson-Lindenstrauss lemma which requires $tilde{O}( min{(m + |S|)epsilon^{-2}, m+nepsilon^{-4} +|S|epsilon^{-2}})$ time. We combine this approximation procedure with an error correction procedure for handing edges where our estimate isnt sufficiently accurate.

قيم البحث

68 - David Durfee , John Peebles , Richard Peng 2017

We show variants of spectral sparsification routines can preserve the total spanning tree counts of graphs, which by Kirchhoffs matrix-tree theorem, is equivalent to determinant of a graph Laplacian minor, or equivalently, of any SDDM matrix. Our ana lyses utilizes this combinatorial connection to bridge between statistical leverage scores / effective resistances and the analysis of random graphs by [Janson, Combinatorics, Probability and Computing `94]. This leads to a routine that in quadratic time, sparsifies a graph down to about $n^{1.5}$ edges in ways that preserve both the determinant and the distribution of spanning trees (provided the sparsified graph is viewed as a random object). Extending this algorithm to work with Schur complements and approximate Choleksy factorizations leads to algorithms for counting and sampling spanning trees which are nearly optimal for dense graphs. We give an algorithm that computes a $(1 pm delta)$ approximation to the determinant of any SDDM matrix with constant probability in about $n^2 delta^{-2}$ time. This is the first routine for graphs that outperforms general-purpose routines for computing determinants of arbitrary matrices. We also give an algorithm that generates in about $n^2 delta^{-2}$ time a spanning tree of a weighted undirected graph from a distribution with total variation distance of $delta$ from the $w$-uniform distribution .

بنى وهياكل البيانات والخوارزميات

Sampling Can Be Faster Than Optimization

106 - Yi-An Ma , Yuansi Chen , Chi Jin 2018

Optimization algorithms and Monte Carlo sampling algorithms have provided the computational foundations for the rapid growth in applications of statistical machine learning in recent years. There is, however, limited theoretical understanding of the relationships between these two kinds of methodology, and limited understanding of relative strengths and weaknesses. Moreover, existing results have been obtained primarily in the setting of convex functions (for optimization) and log-concave functions (for sampling). In this setting, where local properties determine global properties, optimization algorithms are unsurprisingly more efficient computationally than sampling algorithms. We instead examine a class of nonconvex objective functions that arise in mixture modeling and multi-stable systems. In this nonconvex setting, we find that the computational complexity of sampling algorithms scales linearly with the model dimension while that of optimization algorithms scales exponentially.

التعلم الالي التعلم الآلي

Spanning directed trees with many leaves

433 - N Alon , F.V. Fomin , G. Gutin 2008

The {sc Directed Maximum Leaf Out-Branching} problem is to find an out-branching (i.e. a rooted oriented spanning tree) in a given digraph with the maximum number of leaves. In this paper, we obtain two combinatorial results on the number of leaves i n out-branchings. We show that - every strongly connected $n$-vertex digraph $D$ with minimum in-degree at least 3 has an out-branching with at least $(n/4)^{1/3}-1$ leaves; - if a strongly connected digraph $D$ does not contain an out-branching with $k$ leaves, then the pathwidth of its underlying graph UG($D$) is $O(klog k)$. Moreover, if the digraph is acyclic, the pathwidth is at most $4k$. The last result implies that it can be decided in time $2^{O(klog^2 k)}cdot n^{O(1)}$ whether a strongly connected digraph on $n$ vertices has an out-branching with at least $k$ leaves. On acyclic digraphs the running time of our algorithm is $2^{O(klog k)}cdot n^{O(1)}$.

بنى وهياكل البيانات والخوارزميات الرياضيات المتقطعة

On Packing Low-Diameter Spanning Trees

120 - Julia Chuzhoy , Merav Parter , Zihan Tan 2020

Edge connectivity of a graph is one of the most fundamental graph-theoretic concepts. The celebrated tree packing theorem of Tutte and Nash-Williams from 1961 states that every $k$-edge connected graph $G$ contains a collection $cal{T}$ of $lfloor k/ 2 rfloor$ edge-disjoint spanning trees, that we refer to as a tree packing; the diameter of the tree packing $cal{T}$ is the largest diameter of any tree in $cal{T}$. A desirable property of a tree packing, that is both sufficient and necessary for leveraging the high connectivity of a graph in distributed communication, is that its diameter is low. Yet, despite extensive research in this area, it is still unclear how to compute a tree packing, whose diameter is sublinear in $|V(G)|$, in a low-diameter graph $G$, or alternatively how to show that such a packing does not exist. In this paper we provide first non-trivial upper and lower bounds on the diameter of tree packing. First, we show that, for every $k$-edge connected $n$-vertex graph $G$ of diameter $D$, there is a tree packing $cal{T}$ of size $Omega(k)$, diameter $O((101klog n)^D)$, that causes edge-congestion at most $2$. Second, we show that for every $k$-edge connected $n$-vertex graph $G$ of diameter $D$, the diameter of $G[p]$ is $O(k^{D(D+1)/2})$ with high probability, where $G[p]$ is obtained by sampling each edge of $G$ independently with probability $p=Theta(log n/k)$. This provides a packing of $Omega(k/log n)$ edge-disjoint trees of diameter at most $O(k^{(D(D+1)/2)})$ each. We then prove that these two results are nearly tight. Lastly, we show that if every pair of vertices in a graph has $k$ edge-disjoint paths of length at most $D$ connecting them, then there is a tree packing of size $k$, diameter $O(Dlog n)$, causing edge-congestion $O(log n)$. We also provide several applications of low-diameter tree packing in distributed computation.

بنى وهياكل البيانات والخوارزميات

Faster Dynamic Matrix Inverse for Faster LPs

88 - Shunhua Jiang , Zhao Song , Omri Weinstein 2020

Motivated by recent Linear Programming solvers, we design dynamic data structures for maintaining the inverse of an $ntimes n$ real matrix under $textit{low-rank}$ updates, with polynomially faster amortized running time. Our data structure is based on a recursive application of the Woodbury-Morrison identity for implementing $textit{cascading}$ low-rank updates, combined with recent sketching technology. Our techniques and amortized analysis of multi-level partial updates, may be of broader interest to dynamic matrix problems. This data structure leads to the fastest known LP solver for general (dense) linear programs, improving the running time of the recent algorithms of (Cohen et al.19, Lee et al.19, Brand20) from $O^*(n^{2+ max{frac{1}{6}, omega-2, frac{1-alpha}{2}}})$ to $O^*(n^{2+max{frac{1}{18}, omega-2, frac{1-alpha}{2}}})$, where $omega$ and $alpha$ are the fast matrix multiplication exponent and its dual. Hence, under the common belief that $omega approx 2$ and $alpha approx 1$, our LP solver runs in $O^*(n^{2.055})$ time instead of $O^*(n^{2.16})$.

بنى وهياكل البيانات والخوارزميات

سجل دخول لتتمكن من نشر تعليقات