We present a sublinear time algorithm that allows one to sample multiple edges from a distribution that is pointwise $\epsilon$-close to the uniform distribution, in an \emph{amortized-efficient} fashion. We consider the adjacency list query model, where access to a graph $G$ is given via degree and neighbor queries. The problem of sampling a single edge in this model has been raised by Eden and Rosenbaum (SOSA 18). Let $n$ and $m$ denote the number of vertices and edges of $G$, respectively. Eden and Rosenbaum provided upper and lower bounds of $\Theta^*(n/\sqrt{m})$ for sampling a single edge in general graphs (where $O^*(\cdot)$ suppresses $\textrm{poly}(1/\epsilon)$ and $\textrm{poly}(\log n)$ dependencies). We ask whether the query complexity lower bound for sampling a single edge can be circumvented when multiple samples are required. That is, can we get an improved amortized per-sample cost if we allow a preprocessing phase? We answer in the affirmative. We present an algorithm that, if one knows the number of required samples $q$ in advance, has an overall cost that is sublinear in $q$, namely, $O^*(\sqrt{q} \cdot (n/\sqrt{m}))$, which is strictly preferable to the $O^*(q \cdot (n/\sqrt{m}))$ cost resulting from $q$ invocations of the algorithm by Eden and Rosenbaum. Subsequent to a preliminary version of this work, Tětek and Thorup (arXiv, preprint) proved that this bound is essentially optimal.
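To make the query model concrete, the following is a minimal sketch of a textbook rejection-sampling baseline that draws one exactly uniform (directed) edge using only degree and neighbor queries. It is not the amortized algorithm of the abstract, nor the algorithm of Eden and Rosenbaum. The names `sample_edge`, `degree`, `neighbor`, and the assumption that an upper bound `d_max` on the maximum degree is known are all illustrative.

```python
import random

def sample_edge(degree, neighbor, n, d_max, rng=random):
    """Return a uniformly random directed edge (u, v).

    degree(u)      -> degree query (hypothetical oracle)
    neighbor(u, i) -> i-th neighbor of u (hypothetical oracle)
    d_max          -> assumed known upper bound on the maximum degree
    """
    while True:
        u = rng.randrange(n)              # uniform random vertex
        d = degree(u)                     # one degree query
        # Accept u with probability deg(u) / d_max, so each directed
        # edge is returned with probability 1 / (n * d_max) per trial.
        if d > 0 and rng.random() < d / d_max:
            return u, neighbor(u, rng.randrange(d))  # one neighbor query

# Toy graph: path 0-1-2 plus the edge 1-3, stored as adjacency lists.
adj = [[1], [0, 2, 3], [1], [1]]
degree = lambda u: len(adj[u])
neighbor = lambda u, i: adj[u][i]

print(sample_edge(degree, neighbor, n=len(adj), d_max=3))
```

The expected number of trials of this baseline is $n \cdot d_{\max} / (2m)$, which can be far from sublinear when the degrees are skewed.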
We present a data structure that can maintain a simple planar graph under edge contractions in linear total time. The data structure supports adjacency queries and provides access to neighbor lists in $O(1)$ time. Moreover, it can report all the arising self-loops and parallel edges. By applying the data structure, we can achieve optimal running times for decremental bridge detection, 2-edge connectivity, maximal 3-edge connected components, and the problem of finding a unique perfect matching for a static planar graph. Furthermore, we improve the running times of algorithms for several planar graph problems, including decremental 2-vertex and 3-edge connectivity, and we show that using our data structure in a black-box manner, one obtains conceptually simple optimal algorithms for computing MST and 5-coloring in planar graphs.
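As an illustration of the operation being maintained, here is a naive sketch of a single edge contraction on a simple graph stored as a dictionary of neighbor sets. It reports the neighbors at which parallel edges arise. It does not achieve the linear total time claimed in the abstract and is not the authors' data structure; the function name `contract` and the representation are assumptions made for this example.

```python
def contract(adj, u, v):
    """Contract edge {u, v} into u; adj maps vertex -> set of neighbors.

    Returns the set of vertices w for which edges {u, w} and {v, w}
    both existed, i.e. where contraction creates a parallel edge.
    The contracted edge itself would become a self-loop and is dropped.
    """
    adj[u].discard(v)
    adj[v].discard(u)
    parallel = adj[u] & adj[v]        # common neighbors give parallel edges
    for w in adj.pop(v):              # move v's remaining edges over to u
        adj[w].discard(v)
        adj[w].add(u)
        adj[u].add(w)
    return parallel

# Usage: contracting one edge of a triangle creates a parallel edge at vertex 2.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(contract(triangle, 0, 1), triangle)
```

Each call costs time proportional to the degrees of `u` and `v`, so a long sequence of contractions can take superlinear total time with this representation.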
Gaussian processes are the gold standard for many real-world modeling problems, especially in cases where a model's success hinges upon its ability to faithfully represent predictive uncertainty. These problems typically exist as parts of larger frameworks, wherein quantities of interest are ultimately defined by integrating over posterior distributions. These quantities are frequently intractable, motivating the use of Monte Carlo methods. Despite substantial progress in scaling up Gaussian processes to large training sets, methods for accurately generating draws from their posterior distributions still scale cubically in the number of test locations. We identify a decomposition of Gaussian processes that naturally lends itself to scalable sampling by separating out the prior from the data. Building on this factorization, we propose an easy-to-use and general-purpose approach for fast posterior sampling, which seamlessly pairs with sparse approximations to afford scalability both during training and at test time. In a series of experiments designed to test competing sampling schemes' statistical properties and practical ramifications, we demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
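The abstract does not spell out the decomposition. One standard identity that separates a prior draw from a data-dependent update is Matheron's rule, shown here as an assumption about what is meant. The notation is chosen for this sketch: $f \sim \mathcal{GP}(0, k)$ is a prior draw, $X$ and $\mathbf{y}$ are the training inputs and noisy targets, $K_{XX} = k(X, X)$, and $\sigma^2$ is the observation noise variance.

```latex
(f \mid \mathbf{y})(\cdot) \overset{d}{=}
  \underbrace{f(\cdot)}_{\text{prior}}
  + \underbrace{k(\cdot, X)\,\bigl(K_{XX} + \sigma^2 I\bigr)^{-1}
      \bigl(\mathbf{y} - f(X) - \boldsymbol{\varepsilon}\bigr)}_{\text{update}},
\qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(0, \sigma^2 I)
```

Under this identity the linear solve involves only the training set, so evaluating a sample path at additional test locations needs no further matrix factorization over the test points.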
Minimizing the discrepancy of a set system is a fundamental problem in combinatorics. One of the cornerstones in this area is the celebrated six standard deviations result of Spencer (AMS 1985): in any system of $n$ sets in a universe of size $n$, there always exists a coloring which achieves discrepancy $6\sqrt{n}$. The original proof of Spencer was existential in nature, and did not give an efficient algorithm to find such a coloring. Recently, a breakthrough work of Bansal (FOCS 2010) gave an efficient algorithm which finds such a coloring. His algorithm was based on an SDP relaxation of the discrepancy problem and a clever rounding procedure. In this work we give a new randomized algorithm to find a coloring as in Spencer's result, based on a restricted random walk we call Edge-Walk. Our algorithm and its analysis use only basic linear algebra and are truly constructive in that they do not appeal to the existential arguments, giving a new proof of Spencer's theorem and the partial coloring lemma.
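For readers new to the objective, the sketch below computes the discrepancy of a $\pm 1$ coloring on a randomly generated set system and prints it next to the value $6\sqrt{n}$. It only evaluates the quantity being minimized; it is not Edge-Walk, and the random instance and the name `discrepancy` are assumptions made for illustration.

```python
import math
import random

def discrepancy(sets, coloring):
    """Largest imbalance |sum of colors| over all sets; colors are -1 or +1."""
    return max(abs(sum(coloring[i] for i in s)) for s in sets)

n = 64
rng = random.Random(1)
# Hypothetical instance: n random subsets of a universe of size n.
sets = [[i for i in range(n) if rng.random() < 0.5] for _ in range(n)]
# A uniformly random coloring, used here only as a point of comparison.
coloring = [rng.choice((-1, 1)) for _ in range(n)]

print("discrepancy of random coloring:", discrepancy(sets, coloring))
print("Spencer's value 6*sqrt(n):     ", 6 * math.sqrt(n))
```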
An edge added to a graph is called an inset edge. Predicting $k$ inset edges that minimize the average distance of a graph is known to be NP-hard. When $k = 1$, the problem can be solved in polynomial time. In this paper, we further find the single inset edge(s) of a tree whose effect on the average distance is closest to a given input value. To do so, we may need the effect of every inset edge in the set of possible inset edges. For this, we propose an algorithm with time complexity between $O(m)$ and $O(m\sqrt{m})$, and an average of less than $O(m \log m)$, where $m$ stands for the number of possible inset edges. It then takes up to $O(\log m)$ time to find the target inset edges for a custom change in the average distance. Using theoretical tools, the algorithm strictly avoids recalculating distances that do not change after a new edge is added to a tree. It then reduces the time complexity of calculating the remaining distances using matrix tools first introduced in [8], together with one additional technique. The resulting time complexity is dynamic: it depends on the input tree and is proportional to the Wiener index of the input tree.
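The effect of an inset edge can be illustrated by brute force: add each candidate edge in turn, recompute all pairwise distances by breadth-first search, and keep the edge giving the smallest total distance. This sketch recomputes every distance from scratch, which is exactly the work the abstract's algorithm avoids, so it is a baseline only. The names `total_distance` and `best_inset_edge` are hypothetical.

```python
from collections import deque
from itertools import combinations

def total_distance(adj):
    """Sum of shortest-path distances over unordered pairs (Wiener index)."""
    total = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # plain BFS from s
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2                     # every pair was counted twice

def best_inset_edge(adj):
    """Try every non-edge; return (new total distance, edge) with the smallest total."""
    best = None
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue
        adj[u].add(v); adj[v].add(u)          # temporarily insert the edge
        w = total_distance(adj)
        adj[u].discard(v); adj[v].discard(u)  # restore the tree
        if best is None or w < best[0]:
            best = (w, (u, v))
    return best

# Usage on the path 0-1-2-3, stored as vertex -> set of neighbors.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(total_distance(path), best_inset_edge(path))
```

Dividing the total distance by the number of vertex pairs gives the average distance, so minimizing one minimizes the other.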
Finding a minimum vertex cover in a network is a fundamental NP-complete graph problem. One way to deal with its computational hardness is to trade the qualitative performance of an algorithm (allowing non-optimal outputs) for an improved running time. For the vertex cover problem, there is a gap between theory and practice when it comes to understanding this tradeoff. On the one hand, it is known that it is NP-hard to approximate a minimum vertex cover within a factor of $\sqrt{2}$. On the other hand, a simple greedy algorithm yields close to optimal approximations in practice. A promising approach towards understanding this discrepancy is to recognize the differences between theoretical worst-case instances and real-world networks. Following this direction, we close the gap between theory and practice by providing an algorithm that efficiently computes nearly optimal vertex cover approximations on hyperbolic random graphs, a network model that closely resembles real-world networks in terms of degree distribution, clustering, and the small-world property. More precisely, our algorithm computes a $(1 + o(1))$-approximation, asymptotically almost surely, and has a running time of $\mathcal{O}(m \log(n))$. The proposed algorithm is an adaptation of the successful greedy approach, enhanced with a procedure that improves on parts of the graph where greedy is not optimal. This makes it possible to introduce a parameter that can be used to tune the tradeoff between approximation performance and running time. Our empirical evaluation on real-world networks shows that this allows for improving over the near-optimal results of the greedy approach.
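The abstract does not specify the greedy algorithm. A common variant, assumed here, repeatedly adds a vertex of maximum remaining degree to the cover and deletes it from the graph. The sketch below implements that variant without the improvement procedure or the tuning parameter described above, and it makes no attempt at the stated $\mathcal{O}(m \log(n))$ running time. The name `greedy_vertex_cover` is illustrative.

```python
def greedy_vertex_cover(adj):
    """Max-degree greedy; adj maps vertex -> set of neighbors. Returns a cover."""
    remaining = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    cover = set()
    while True:
        # Pick a vertex with the most uncovered incident edges.
        v = max(remaining, key=lambda u: len(remaining[u]), default=None)
        if v is None or not remaining[v]:
            return cover                  # no uncovered edges are left
        cover.add(v)
        for u in remaining.pop(v):        # delete v and its incident edges
            remaining[u].discard(v)

# Usage: on a star the greedy choice is the center, which is optimal.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(greedy_vertex_cover(star))
```

Scanning for the maximum-degree vertex in every round makes this sketch quadratic in the number of vertices; a bucket or heap structure over degrees would be the usual way to speed it up.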