Fast Computation of Abelian Runs

Published by Gabriele Fici

Publication date: 2015

Research field: Informatics Engineering

Paper language: English

Authors: Gabriele Fici, Tomasz Kociumaka, Thierry Lecroq

Data Structures and Algorithms

Given a word $w$ and a Parikh vector $\mathcal{P}$, an abelian run of period $\mathcal{P}$ in $w$ is a maximal occurrence of a substring of $w$ having abelian period $\mathcal{P}$. Our main result is an online algorithm that, given a word $w$ of length $n$ over an alphabet of cardinality $\sigma$ and a Parikh vector $\mathcal{P}$, returns all the abelian runs of period $\mathcal{P}$ in $w$ in time $O(n)$ and space $O(\sigma+p)$, where $p$ is the norm of $\mathcal{P}$, i.e., the sum of its components. We also present an online algorithm that computes all the abelian runs with periods of norm $p$ in $w$ in time $O(np)$, for any given norm $p$. Finally, we give an $O(n^2)$-time offline randomized algorithm for computing all the abelian runs of $w$. Its deterministic counterpart runs in $O(n^2\log\sigma)$ time.
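To make the notions of Parikh vector and norm concrete, here is a minimal Python sketch of a basic building block: a sliding window that reports every start position where a factor of length $p$ has Parikh vector $\mathcal{P}$. It is not the algorithm from the paper and does not handle the heads and tails of abelian runs; the function names `parikh` and `abelian_occurrences` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def parikh(word):
    """Parikh vector of a word: maps each letter to its number of occurrences."""
    return Counter(word)


def abelian_occurrences(w, P):
    """Return start positions i such that w[i:i+p] has Parikh vector P.

    P is a mapping letter -> count and p is its norm (sum of components).
    Illustrative helper only (an assumption, not the paper's method).
    """
    p = sum(P.values())
    if p == 0 or p > len(w):
        return []

    # diff[a] = (occurrences of a in the current window) - P[a]
    diff = Counter()
    for a, c in P.items():
        diff[a] -= c
    for a in w[:p]:
        diff[a] += 1

    # Number of letters whose window count currently differs from P.
    nonzero = sum(1 for v in diff.values() if v != 0)
    hits = [0] if nonzero == 0 else []

    for i in range(p, len(w)):
        # Slide the window: letter w[i] enters, letter w[i - p] leaves.
        for a, delta in ((w[i], 1), (w[i - p], -1)):
            if diff[a] == 0:
                nonzero += 1
            diff[a] += delta
            if diff[a] == 0:
                nonzero -= 1
        if nonzero == 0:
            hits.append(i - p + 1)
    return hits
```

For example, `abelian_occurrences("abaab", {"a": 1, "b": 1})` inspects the windows `ab`, `ba`, `aa`, `ab` and reports the three that contain exactly one `a` and one `b`. Each step updates only two counters, so the scan takes time linear in the length of the word once the counters for $\mathcal{P}$ are set up.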

Yi-Jun Chang, Manuela Fischer, Mohsen Ghaffari (2018)

We present new randomized algorithms that improve the complexity of the classic $(\Delta+1)$-coloring problem, and its generalization $(\Delta+1)$-list-coloring, in three well-studied models of distributed, parallel, and centralized computation:

Distributed Congested Clique: We present an $O(1)$-round randomized algorithm for $(\Delta+1)$-list coloring in the congested clique model of distributed computing. This settles the asymptotic complexity of this problem. It moreover improves upon the $O(\log^\ast \Delta)$-round randomized algorithms of Parter and Su [DISC18] and the $O((\log\log \Delta) \cdot \log^\ast \Delta)$-round randomized algorithm of Parter [ICALP18].

Massively Parallel Computation: We present a $(\Delta+1)$-list coloring algorithm with round complexity $O(\sqrt{\log\log n})$ in the Massively Parallel Computation (MPC) model with strongly sublinear memory per machine. This algorithm uses a memory of $O(n^{\alpha})$ per machine, for any desirable constant $\alpha>0$, and a total memory of $\widetilde{O}(m)$, where $m$ is the size of the graph. Notably, this is the first coloring algorithm with sublogarithmic round complexity in the sublinear memory regime of MPC. For the quasilinear memory regime of MPC, an $O(1)$-round algorithm was given very recently by Assadi et al. [SODA19].

Centralized Local Computation: We show that $(\Delta+1)$-list coloring can be solved with $\Delta^{O(1)} \cdot O(\log n)$ query complexity in the centralized local computation model. The previous state-of-the-art results for $(\Delta+1)$-list coloring in the centralized local computation model are based on simulation of known LOCAL algorithms.

Data Structures and Algorithms

Efficient On-line Computation of Visibility Graphs

Delia Fano Yela, Florian Thalmann, Vincenzo Nicosia (2019)

A visibility algorithm maps time series into complex networks following a simple criterion. The resulting visibility graph has recently proven to be a powerful tool for time series analysis. However, its straightforward computation is time-consuming and rigid, motivating the development of more efficient algorithms. Here we present a highly efficient method to compute visibility graphs with the further benefit of flexibility: on-line computation. We propose an encoder/decoder approach, with an on-line adjustable binary search tree codec for time series as well as its corresponding decoder for visibility graphs. The empirical evidence suggests the proposed method for computation of visibility graphs offers an on-line computation solution at no additional computation time cost. The source code is available online.
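As a point of comparison, the straightforward computation mentioned above can be sketched in a few lines of Python. The sketch assumes the commonly used natural visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them) and evenly spaced time stamps; the abstract does not state which criterion is meant, and this is the naive baseline, not the encoder/decoder method the authors propose. The function name `natural_visibility_graph` is hypothetical.

```python
def natural_visibility_graph(series):
    """Naive visibility graph of a time series sampled at t = 0, 1, 2, ...

    Assumption: natural visibility criterion. Samples a < b are linked when
    every intermediate sample c satisfies
        y_c < y_b + (y_a - y_b) * (b - c) / (b - a).
    Returns the edge list as pairs (a, b) with a < b.
    """
    n = len(series)
    edges = []
    for a in range(n):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            # Adjacent samples have no intermediate point, so all() is True.
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.append((a, b))
    return edges
```

For the series `[3, 1, 2]` the first and last samples see each other over the dip in the middle, while for `[1, 3, 1]` the peak blocks that edge. Every pair is tested against all samples between it, which is the cost that more efficient methods aim to avoid.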

Data Structures and Algorithms

Skyline Computation with Noisy Comparisons

Benoît Groz, Frederik Mallmann-Trenn, Claire Mathieu (2017)

Given a set of $n$ points in a $d$-dimensional space, we seek to compute the skyline, i.e., those points that are not strictly dominated by any other point, using few comparisons between elements. We adopt the noisy comparison model [FRPU94] where comparisons fail with constant probability and confidence can be increased through independent repetitions of a comparison. In this model motivated by Crowdsourcing applications, Groz & Milo [GM15] show three bounds on the query complexity for the skyline problem. We improve significantly on that state of the art and provide two output-sensitive algorithms computing the skyline with respective query complexity $O(nd\log(dk/\delta))$ and $O(ndk\log(k/\delta))$, where $k$ is the size of the skyline and $\delta$ the expected probability that our algorithm fails to return the correct answer. These results are tight for low dimensions.
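The two ingredients named above, the skyline itself and repetition of a noisy comparison, can be illustrated with a short Python sketch. It assumes that larger coordinates are better and that "strictly dominated" means no worse in every coordinate and strictly worse in at least one; it uses exact comparisons and a quadratic scan, so it is a baseline for the definition and not one of the output-sensitive algorithms from the paper. The names `dominates`, `skyline` and `majority_vote` are hypothetical.

```python
def dominates(q, p):
    """True if q strictly dominates p (assumption: larger is better)."""
    return all(a >= b for a, b in zip(q, p)) and any(a > b for a, b in zip(q, p))


def skyline(points):
    """Points not strictly dominated by any other point (naive O(n^2 d) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def majority_vote(noisy_compare, x, y, repetitions):
    """Repeat a possibly faulty boolean comparison and return the majority answer.

    noisy_compare(x, y) is a caller-supplied callable (hypothetical); with
    independent errors, more repetitions make the majority more reliable.
    """
    votes = sum(1 for _ in range(repetitions) if noisy_compare(x, y))
    return 2 * votes > repetitions
```

With the points `(1, 4)`, `(2, 3)`, `(3, 1)`, `(2, 2)` and `(1, 1)`, the first three form the skyline, since `(2, 3)` dominates both `(2, 2)` and `(1, 1)`. Using an odd number of repetitions in `majority_vote` avoids ties.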

Data Structures and Algorithms

Simulation computation in grammar-compressed graphs

Stefan Bottcher, Rita Hartel, Sven Peeters (2020)

Like [1], we present an algorithm to compute the simulation of a query pattern in a graph of labeled nodes and unlabeled edges. However, our algorithm works on a compressed graph grammar, instead of on the original graph. The speed-up of our algorithm compared to the algorithm in [1] grows with the size of the graph and with the compression strength.

Data Structures and Algorithms, Databases

Efficient Computation of Positional Population Counts Using SIMD Instructions

Marcus D. R. Klarqvist, Wojciech Muła, Daniel Lemire (2019)

In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k-1. If the k-bit words are one-hot encoded then the sums correspond to a frequency histogram. This multiple-sum problem is a generalization of the population-count problem where we seek the sum of all bit values. Accordingly, we refer to the multiple-sum problem as a positional population-count. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit position population count using less than half of a CPU cycle per 16-bit word. Our best approach uses up to 400 times fewer instructions and is up to 50 times faster than baseline code using only regular (non-SIMD) instructions, for sufficiently large inputs.
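The multiple-sum problem described above can be stated precisely with a scalar reference version in Python. This is only a sketch of the problem definition, comparable in spirit to the non-SIMD baseline, and not the SIMD algorithms from the paper; the function name `positional_popcount` and its parameter `k` are hypothetical.

```python
def positional_popcount(words, k=16):
    """Return k sums, where sums[i] counts the words whose bit i is set.

    Scalar reference implementation (an illustration, not the SIMD method).
    """
    sums = [0] * k
    for w in words:
        for i in range(k):
            sums[i] += (w >> i) & 1
    return sums


# One-hot encode categorical values (value v -> word with only bit v set);
# the positional population count is then the frequency histogram.
values = [0, 2, 2, 1]
one_hot = [1 << v for v in values]
histogram = positional_popcount(one_hot, k=4)
```

Here `histogram[v]` counts how often the value `v` occurs in `values`. Summing all entries of the result gives the ordinary population count of the whole stream, which is the sense in which the positional variant generalizes it.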

Data Structures and Algorithms
