Efficient On-line Computation of Visibility Graphs

Published by: Delia Fano Yela

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Delia Fano Yela, Florian Thalmann, Vincenzo Nicosia

Data Structures and Algorithms

A visibility algorithm maps time series into complex networks following a simple criterion. The resulting visibility graph has recently proven to be a powerful tool for time series analysis. However, its straightforward computation is time-consuming and rigid, motivating the development of more efficient algorithms. Here we present a highly efficient method to compute visibility graphs with the further benefit of flexibility: on-line computation. We propose an encoder/decoder approach, with an on-line adjustable binary search tree codec for time series as well as its corresponding decoder for visibility graphs. The empirical evidence suggests that the proposed method offers on-line computation of visibility graphs at no additional computation time cost. The source code is available online.
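For orientation, the "straightforward computation" mentioned in the abstract can be sketched as follows. This is a minimal sketch, not the authors' binary search tree method. It assumes the natural visibility criterion (two samples are connected when the straight line between them passes strictly above every sample in between) and evenly spaced time stamps 0, 1, 2, ...; the function name `natural_visibility_graph` is hypothetical.

```python
from typing import List, Set, Tuple


def natural_visibility_graph(series: List[float]) -> Set[Tuple[int, int]]:
    """Return the edges (a, b), a < b, of the natural visibility graph.

    Naive quadratic scan over all pairs; assumes samples sit at
    integer times 0, 1, 2, ...
    """
    edges: Set[Tuple[int, int]] = set()
    n = len(series)
    for a in range(n - 1):
        # Steepest slope from sample a to any sample seen so far.
        # Sample b is visible from a only if the slope of the line a->b
        # strictly exceeds the slope to every sample in between.
        max_slope = float("-inf")
        for b in range(a + 1, n):
            slope = (series[b] - series[a]) / (b - a)
            if slope > max_slope:
                edges.add((a, b))
                max_slope = slope
    return edges


if __name__ == "__main__":
    print(sorted(natural_visibility_graph([1, 3, 2, 4])))
    # [(0, 1), (1, 2), (1, 3), (2, 3)]
```

The whole graph has to be recomputed from scratch whenever the series changes, which is the rigidity an on-line method is meant to avoid.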

Yuya Sasaki, Yasuhiro Fujiwara, Makoto Onizuka (2020)

Network reliability is an important metric to evaluate the connectivity among given vertices in uncertain graphs. Since the network reliability problem is known to be #P-complete, existing studies have used approximation techniques. In this paper, we propose a new sampling-based approach that efficiently and accurately approximates network reliability. Our approach improves efficiency by reducing the number of samples based on stratified sampling. We theoretically guarantee that our approach improves the accuracy of approximation by using lower and upper bounds of network reliability, even though it reduces the number of samples. To efficiently compute the bounds, we develop an extended BDD, called S2BDD. While constructing the S2BDD, our approach employs dynamic programming for efficiently sampling possible graphs. Our experiment with real datasets demonstrates that our approach is up to 51.2 times faster than the existing sampling-based approach with higher accuracy.
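To make the sampling idea concrete, here is a minimal sketch of a plain Monte Carlo estimator, the kind of baseline a stratified method would aim to improve on. It is not the S2BDD approach from the abstract. It assumes an undirected uncertain graph given as `(u, v, probability)` triples with independent edges, and it estimates only two-terminal reliability; the name `estimate_reliability` and its parameters are hypothetical.

```python
import random
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable, float]


def estimate_reliability(edges: List[Edge], source: Hashable, target: Hashable,
                         samples: int = 10000, seed: int = 0) -> float:
    """Estimate the probability that source and target are connected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one possible graph: keep each edge with its probability.
        adj: Dict[Hashable, List[Hashable]] = {}
        for u, v, p in edges:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
                adj.setdefault(v, []).append(u)
        # Depth-first search from the source in the sampled graph.
        seen = {source}
        stack = [source]
        while stack:
            node = stack.pop()
            if node == target:
                break
            for nxt in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        hits += target in seen
    return hits / samples


if __name__ == "__main__":
    # Two edges in series, each present with probability 0.5:
    # the exact reliability is 0.5 * 0.5 = 0.25.
    path = [("s", "a", 0.5), ("a", "t", 0.5)]
    print(estimate_reliability(path, "s", "t"))
```

The estimator's error shrinks only with the square root of the sample count, which is why reducing the number of samples needed matters.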

Data Structures and Algorithms, Databases

Spectral Lower Bounds on the I/O Complexity of Computation Graphs

Saachi Jain, Matei Zaharia (2019)

We consider the problem of finding lower bounds on the I/O complexity of arbitrary computations in a two-level memory hierarchy. Executions of complex computations can be formalized as an evaluation order over the underlying computation graph. However, prior methods for finding I/O lower bounds leverage the graph structures for specific problems (e.g., matrix multiplication), which cannot be applied to arbitrary graphs. In this paper, we first present a novel method to bound the I/O of any computation graph using the first few eigenvalues of the graph's Laplacian. We further extend this bound to the parallel setting. This spectral bound is not only efficiently computable by power iteration, but can also be computed in closed form for graphs with known spectra. We apply our spectral method to compute closed-form analytical bounds on two computation graphs (the Bellman-Held-Karp algorithm for the traveling salesman problem and the Fast Fourier Transform), as well as provide a probabilistic bound for random Erdős–Rényi graphs. We empirically validate our bound on four computation graphs, and find that our method provides tighter bounds than current empirical methods and behaves similarly to previously published I/O bounds.

Data Structures and Algorithms

PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs

Zhewei Wei, Xiaodong He, Xiaokui Xiao (2019)

SimRank is a classic measure of the similarities of nodes in a graph. Given a node $u$ in graph $G = (V, E)$, a single-source SimRank query returns the SimRank similarities $s(u, v)$ between node $u$ and each node $v \in V$. This type of query has numerous applications in web search and social network analysis, such as link prediction, web mining, and spam detection. Existing methods for single-source SimRank queries, however, incur query cost at least linear in the number of nodes $n$, which renders them inapplicable for real-time and interactive analysis. This paper proposes PRSim, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries. PRSim uses an index of size $O(m)$, where $m$ is the number of edges in the graph, and guarantees a query time that depends on the reverse PageRank distribution of the input graph. In particular, we prove that PRSim runs in sub-linear time if the degree distribution of the input graph follows the power-law distribution, a property possessed by many real-world graphs. Based on the theoretical analysis, we show that the empirical query time of all existing SimRank algorithms also depends on the reverse PageRank distribution of the graph. Finally, we present the first experimental study that evaluates the absolute errors of various SimRank algorithms on large graphs, and we show that PRSim outperforms the state of the art in terms of query time, accuracy, index size, and scalability.
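As a point of reference for what a single-source query returns, the following is a minimal sketch of the textbook SimRank iteration, not the PRSim algorithm. It assumes the standard recursive definition (a node has similarity 1 with itself; two distinct nodes are similar in proportion to the average similarity of their in-neighbors, scaled by a decay factor). The decay value 0.6, the iteration count, and the name `simrank_single_source` are illustrative assumptions.

```python
from typing import Dict, Hashable, List


def simrank_single_source(in_neighbors: Dict[Hashable, List[Hashable]],
                          u: Hashable, decay: float = 0.6,
                          iterations: int = 10) -> Dict[Hashable, float]:
    """Return s(u, v) for every node v via naive all-pairs iteration.

    in_neighbors must contain a key for every node in the graph.
    """
    nodes = list(in_neighbors)
    # Start from the identity: each node is only similar to itself.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new: Dict[Hashable, Dict[Hashable, float]] = {}
        for a in nodes:
            new[a] = {}
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # No in-neighbors on one side: similarity is zero.
                    new[a][b] = 0.0
                    continue
                total = sum(sim[x][y] for x in ia for y in ib)
                new[a][b] = decay * total / (len(ia) * len(ib))
        sim = new
    return sim[u]


if __name__ == "__main__":
    # Node "a" links to both "b" and "c", so "b" and "c" share an in-neighbor.
    graph = {"a": [], "b": ["a"], "c": ["a"]}
    print(simrank_single_source(graph, "b"))
```

This version touches every pair of nodes in each iteration, so its cost is far above linear in $n$, which illustrates why such methods do not suit interactive queries on large graphs.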

Data Structures and Algorithms

Simulation computation in grammar-compressed graphs

Stefan Bottcher, Rita Hartel, Sven Peeters (2020)

Like [1], we present an algorithm to compute the simulation of a query pattern in a graph of labeled nodes and unlabeled edges. However, our algorithm works on a compressed graph grammar instead of the original graph. The speed-up of our algorithm compared to the algorithm in [1] grows with the size of the graph and with the compression strength.

Data Structures and Algorithms, Databases

Efficient Computation of Positional Population Counts Using SIMD Instructions

Marcus D. R. Klarqvist, Wojciech Muła, Daniel Lemire (2019)

In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k-1. If the k-bit words are one-hot encoded then the sums correspond to a frequency histogram. This multiple-sum problem is a generalization of the population-count problem where we seek the sum of all bit values. Accordingly, we refer to the multiple-sum problem as a positional population-count. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit positional population count using less than half of a CPU cycle per 16-bit word. Our best approach uses up to 400 times fewer instructions and is up to 50 times faster than baseline code using only regular (non-SIMD) instructions, for sufficiently large inputs.
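The multiple-sum problem described above can be stated in a few lines of scalar code. This is a minimal sketch of the problem definition only, comparable in spirit to a non-SIMD baseline; it is not the authors' SIMD algorithm, and the name `positional_popcount` is hypothetical.

```python
from typing import Iterable, List


def positional_popcount(words: Iterable[int], k: int = 16) -> List[int]:
    """Return k sums, where sums[i] counts the words with bit i set."""
    sums = [0] * k
    for word in words:
        for i in range(k):
            # Extract bit i of the word and add it to its own counter.
            sums[i] += (word >> i) & 1
    return sums


if __name__ == "__main__":
    # One-hot inputs: the result is a frequency histogram of the categories.
    one_hot = [0b0001, 0b0010, 0b0010, 0b1000]
    print(positional_popcount(one_hot, k=4))
    # [1, 2, 0, 1]
```

Summing the returned list gives the ordinary population count of the whole stream, which is the sense in which the positional version generalizes it.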

Data Structures and Algorithms
