No Arabic abstract
Spectral algorithms, such as principal component analysis and spectral clustering, typically require careful data transformations to be effective: upon observing a matrix $A$, one may look at the spectrum of $psi(A)$ for a properly chosen $psi$. The issue is that the spectrum of $A$ might be contaminated by non-informational top eigenvalues, e.g., due to scale` variations in the data, and the application of $psi$ aims to remove these. Designing a good functional $psi$ (and establishing what good means) is often challenging and model dependent. This paper proposes a simple and generic construction for sparse graphs, $$psi(A) = 1((I+A)^r ge1),$$ where $A$ denotes the adjacency matrix and $r$ is an integer (less than the graph diameter). This produces a graph connecting vertices from the original graph that are within distance $r$, and is referred to as graph powering. It is shown that graph powering regularizes the graph and decontaminates its spectrum in the following sense: (i) If the graph is drawn from the sparse ErdH{o}s-Renyi ensemble, which has no spectral gap, it is shown that graph powering produces a `maximal spectral gap, with the latter justified by establishing an Alon-Boppana result for powered graphs; (ii) If the graph is drawn from the sparse SBM, graph powering is shown to achieve the fundamental limit for weak recovery (the KS threshold) similarly to cite{massoulie-STOC}, settling an open problem therein. Further, graph powering is shown to be significantly more robust to tangles and cliques than previous spectral algorithms based on self-avoiding or nonbacktracking walk counts cite{massoulie-STOC,Mossel_SBM2,bordenave,colin3}. This is illustrated on a geometric block model that is dense in cliques.
We study the problem of finding a spanning forest in an undirected, $n$-vertex multi-graph under two basic query models. One is the Linear query model which are linear measurements on the incidence vector induced by the edges; the other is the weaker OR query model which only reveals whether a given subset of plausible edges is empty or not. At the heart of our study lies a fundamental problem which we call the {em single element recovery} problem: given a non-negative real vector $x$ in $N$ dimension, return a single element $x_j > 0$ from the support. Queries can be made in rounds, and our goals is to understand the trade-offs between the query complexity and the rounds of adaptivity needed to solve these problems, for both deterministic and randomized algorithms. These questions have connections and ramifications to multiple areas such as sketching, streaming, graph reconstruction, and compressed sensing. Our main results are: * For the single element recovery problem, it is easy to obtain a deterministic, $r$-round algorithm which makes $(N^{1/r}-1)$-queries per-round. We prove that this is tight: any $r$-round deterministic algorithm must make $geq (N^{1/r} - 1)$ linear queries in some round. In contrast, a $1$-round $O(log^2 N)$-query randomized algorithm which succeeds 99% of the time is known to exist. * We design a deterministic $O(r)$-round, $tilde{O}(n^{1+1/r})$-OR query algorithm for graph connectivity. We complement this with an $tilde{Omega}(n^{1 + 1/r})$-lower bound for any $r$-round deterministic algorithm in the OR-model. * We design a randomized, $2$-round algorithm for the graph connectivity problem which makes $tilde{O}(n)$-OR queries. In contrast, we prove that any $1$-round algorithm (possibly randomized) requires $tilde{Omega}(n^2)$-OR queries.
In general, a graph modification problem is defined by a graph modification operation $boxtimes$ and a target graph property ${cal P}$. Typically, the modification operation $boxtimes$ may be vertex removal}, edge removal}, edge contraction}, or edge addition and the question is, given a graph $G$ and an integer $k$, whether it is possible to transform $G$ to a graph in ${cal P}$ after applying $k$ times the operation $boxtimes$ on $G$. This problem has been extensively studied for particilar instantiations of $boxtimes$ and ${cal P}$. In this paper we consider the general property ${cal P}_{{phi}}$ of being planar and, moreover, being a model of some First-Order Logic sentence ${phi}$ (an FOL-sentence). We call the corresponding meta-problem Graph $boxtimes$-Modification to Planarity and ${phi}$ and prove the following algorithmic meta-theorem: there exists a function $f:Bbb{N}^{2}toBbb{N}$ such that, for every $boxtimes$ and every FOL sentence ${phi}$, the Graph $boxtimes$-Modification to Planarity and ${phi}$ is solvable in $f(k,|{phi}|)cdot n^2$ time. The proof constitutes a hybrid of two different classic techniques in graph algorithms. The first is the irrelevant vertex technique that is typically used in the context of Graph Minors and deals with properties such as planarity or surface-embeddability (that are not FOL-expressible) and the second is the use of Gaifmans Locality Theorem that is the theoretical base for the meta-algorithmic study of FOL-expressible problems.
This paper formalizes connections between stability of polynomials and convergence rates of Markov Chain Monte Carlo (MCMC) algorithms. We prove that if a (multivariate) partition function is nonzero in a region around a real point $lambda$ then spectral independence holds at $lambda$. As a consequence, for Holant-type problems (e.g., spin systems) on bounded-degree graphs, we obtain optimal $O(nlog n)$ mixing time bounds for the single-site update Markov chain known as the Glauber dynamics. Our result significantly improves the running time guarantees obtained via the polynomial interpolation method of Barvinok (2017), refined by Patel and Regts (2017). There are a variety of applications of our results. In this paper, we focus on Holant-type (i.e., edge-coloring) problems, including weighted edge covers and weighted even subgraphs. For the weighted edge cover problem (and several natural generalizations) we obtain an $O(nlog{n})$ sampling algorithm on bounded-degree graphs. The even subgraphs problem corresponds to the high-temperature expansion of the ferromagnetic Ising model. We obtain an $O(nlog{n})$ sampling algorithm for the ferromagnetic Ising model with a nonzero external field on bounded-degree graphs, which improves upon the classical result of Jerrum and Sinclair (1993) for this class of graphs. We obtain further applications to antiferromagnetic two-spin models on line graphs, weighted graph homomorphisms, tensor networks, and more.
Massive sizes of real-world graphs, such as social networks and web graph, impose serious challenges to process and perform analytics on them. These issues can be resolved by working on a small summary of the graph instead . A summary is a compressed version of the graph that removes several details, yet preserves its essential structure. Generally, some predefined quality measure of the summary is optimized to bound the approximation error incurred by working on the summary instead of the whole graph. All known summarization algorithms are computationally prohibitive and do not scale to large graphs. In this paper we present an efficient randomized algorithm to compute graph summaries with the goal to minimize reconstruction error. We propose a novel weighted sampling scheme to sample vertices for merging that will result in the least reconstruction error. We provide analytical bounds on the running time of the algorithm and prove approximation guarantee for our score computation. Efficiency of our algorithm makes it scalable to very large graphs on which known algorithms cannot be applied. We test our algorithm on several real world graphs to empirically demonstrate the quality of summaries produced and compare to state of the art algorithms. We use the summaries to answer several structural queries about original graph and report their accuracies.
A bond of a graph $G$ is an inclusion-wise minimal disconnecting set of $G$, i.e., bonds are cut-sets that determine cuts $[S,Vsetminus S]$ of $G$ such that $G[S]$ and $G[Vsetminus S]$ are both connected. Given $s,tin V(G)$, an $st$-bond of $G$ is a bond whose removal disconnects $s$ and $t$. Contrasting with the large number of studies related to maximum cuts, there are very few results regarding the largest bond of general graphs. In this paper, we aim to reduce this gap on the complexity of computing the largest bond and the largest $st$-bond of a graph. Although cuts and bonds are similar, we remark that computing the largest bond of a graph tends to be harder than computing its maximum cut. We show that {sc Largest Bond} remains NP-hard even for planar bipartite graphs, and it does not admit a constant-factor approximation algorithm, unless $P = NP$. We also show that {sc Largest Bond} and {sc Largest $st$-Bond} on graphs of clique-width $w$ cannot be solved in time $f(w)times n^{o(w)}$ unless the Exponential Time Hypothesis fails, but they can be solved in time $f(w)times n^{O(w)}$. In addition, we show that both problems are fixed-parameter tractable when parameterized by the size of the solution, but they do not admit polynomial kernels unless NP $subseteq$ coNP/poly.