Research papers and master's and doctoral theses published by Tomasz Kociumaka

34 - Gabriele Fici, Tomasz Kociumaka, Thierry Lecroq 2015

Given a word $w$ and a Parikh vector $\mathcal{P}$, an abelian run of period $\mathcal{P}$ in $w$ is a maximal occurrence of a substring of $w$ having abelian period $\mathcal{P}$. Our main result is an online algorithm that, given a word $w$ of length $n$ over an alphabet of cardinality $\sigma$ and a Parikh vector $\mathcal{P}$, returns all the abelian runs of period $\mathcal{P}$ in $w$ in time $O(n)$ and space $O(\sigma+p)$, where $p$ is the norm of $\mathcal{P}$, i.e., the sum of its components. We also present an online algorithm that computes all the abelian runs with periods of norm $p$ in $w$ in time $O(np)$, for any given norm $p$. Finally, we give an $O(n^2)$-time offline randomized algorithm for computing all the abelian runs of $w$. Its deterministic counterpart runs in $O(n^2\log\sigma)$ time.
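To make the notion of an abelian period concrete, here is a minimal Python sketch. It is not the online algorithm from the abstract. It assumes the commonly used definition, which the abstract does not spell out: a word has abelian period P if it splits into a head, at least one full block with Parikh vector P, and a tail, where the head and tail are componentwise bounded by P and have strictly smaller norm. The function names `norm`, `contained`, and `has_abelian_period` are hypothetical.

```python
from collections import Counter


def norm(vec):
    """Norm of a Parikh vector: the sum of its components."""
    return sum(vec.values())


def contained(small, big):
    """Componentwise small <= big with strictly smaller norm (assumed convention)."""
    return all(small[c] <= big[c] for c in small) and norm(small) < norm(big)


def has_abelian_period(word, period):
    """Naive check of the assumed definition: head + full blocks + tail."""
    period = +Counter(period)  # unary plus drops zero counts
    p = norm(period)
    for h in range(p):  # candidate head lengths 0 .. p-1
        if h > len(word):
            break
        if not contained(Counter(word[:h]), period):
            continue
        i, blocks = h, 0
        # Consume consecutive blocks whose Parikh vector equals the period.
        while i + p <= len(word) and Counter(word[i:i + p]) == period:
            i += p
            blocks += 1
        # Whatever is left must be a proper tail (shorter than a block).
        if blocks >= 1 and contained(Counter(word[i:]), period):
            return True
    return False
```

For instance, `has_abelian_period("abaab", {"a": 1, "b": 1})` holds through the split `a | ba | ab`. Finding abelian runs would additionally require extending such occurrences to maximal ones inside a longer word, which this sketch does not attempt.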

Data Structures and Algorithms

Covering Problems for Partial Words and for Indeterminate Strings

111 - Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka 2014

We consider the problem of computing a shortest solid cover of an indeterminate string. An indeterminate string may contain non-solid symbols, each of which specifies a subset of the alphabet that could be present at the corresponding position. We also consider covering partial words, which are a special case of indeterminate strings where each non-solid symbol is a don't care symbol. We prove that the indeterminate string covering problem and the partial word covering problem are NP-complete for a binary alphabet and show that both problems are fixed-parameter tractable with respect to $k$, the number of non-solid symbols. For the indeterminate string covering problem we obtain a $2^{O(k \log k)} + n k^{O(1)}$-time algorithm. For the partial word covering problem we obtain a $2^{O(\sqrt{k}\log k)} + nk^{O(1)}$-time algorithm. We prove that, unless the Exponential Time Hypothesis is false, no $2^{o(\sqrt{k})} n^{O(1)}$-time solution exists for either problem, which shows that our algorithm for this case is close to optimal. We also present an algorithm for both problems which is feasible in practice.
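The definition of a solid cover can be illustrated with a brute-force Python sketch. It is exponential in the worst case and is not one of the fixed-parameter algorithms from the abstract. The representation is an assumption made for illustration: an indeterminate string is a list of non-empty sets of letters, a solid symbol is a singleton set, and a don't care symbol of a partial word is the full alphabet. The function names are hypothetical.

```python
from itertools import product


def occurs_at(u, w, i):
    """True if the solid string u matches w at position i."""
    return all(u[j] in w[i + j] for j in range(len(u)))


def is_cover(u, w):
    """True if every position of w lies within some occurrence of u."""
    n, m = len(w), len(u)
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    for i in range(n - m + 1):
        if i > covered_up_to:
            return False  # position covered_up_to can no longer be covered
        if occurs_at(u, w, i):
            covered_up_to = i + m
    return covered_up_to == n


def shortest_solid_cover(w):
    """Try all solid strings matching a prefix of w, shortest first."""
    for m in range(1, len(w) + 1):
        for letters in product(*(sorted(s) for s in w[:m])):
            u = "".join(letters)
            if is_cover(u, w):
                return u
    return None
```

With `w = [{"a"}, {"b"}, {"a", "b"}, {"b"}]`, the call `shortest_solid_cover(w)` returns `"ab"`, because `ab` matches at positions 0 and 2 and these two occurrences cover all four positions.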

Data Structures and Algorithms

Fast Algorithm for Partial Covers in Words

56 - Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter 2013

A factor $u$ of a word $w$ is a cover of $w$ if every position in $w$ lies within some occurrence of $u$ in $w$. A word $w$ covered by $u$ thus generalizes the idea of a repetition, that is, a word composed of exact concatenations of $u$. In this article we introduce a new notion of $\alpha$-partial cover, which can be viewed as a relaxed variant of cover, that is, a factor covering at least $\alpha$ positions in $w$. We develop a data structure of $O(n)$ size (where $n=|w|$) that can be constructed in $O(n\log n)$ time which we apply to compute all shortest $\alpha$-partial covers for a given $\alpha$. We also employ it for an $O(n\log n)$-time algorithm computing a shortest $\alpha$-partial cover for each $\alpha=1,2,\ldots,n$.
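The definition of a partial cover can be checked directly with a naive Python sketch. It runs in polynomial but far from near-linear time and is unrelated to the data structure described in the abstract. The function names `covered_positions` and `shortest_partial_covers` are hypothetical.

```python
def covered_positions(u, w):
    """Number of positions of w lying within some occurrence of u."""
    covered = [False] * len(w)
    start = w.find(u)
    while start != -1:
        for j in range(start, start + len(u)):
            covered[j] = True
        start = w.find(u, start + 1)  # allow overlapping occurrences
    return sum(covered)


def shortest_partial_covers(w, alpha):
    """All shortest factors of w covering at least alpha positions."""
    n = len(w)
    for m in range(1, n + 1):
        factors = {w[i:i + m] for i in range(n - m + 1)}
        found = sorted(u for u in factors if covered_positions(u, w) >= alpha)
        if found:
            return found
    return []
```

For `w = "ababa"`, the factors `ab` and `ba` each cover four positions, so `shortest_partial_covers("ababa", 4)` returns `["ab", "ba"]`, while covering all five positions needs the longer factor `aba`.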

Data Structures and Algorithms

A Note on the Longest Common Compatible Prefix Problem for Partial Words

180 - Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka 2013

For a partial word $w$ the longest common compatible prefix of two positions $i,j$, denoted $lccp(i,j)$, is the largest $k$ such that $w[i,i+k-1] \uparrow w[j,j+k-1]$, where $\uparrow$ is the compatibility relation of partial words (it is not an equivalence relation). The LCCP problem is to preprocess a partial word in such a way that any query $lccp(i,j)$ about this word can be answered in $O(1)$ time. It is a natural generalization of the longest common prefix (LCP) problem for regular words, for which an $O(n)$ preprocessing time and $O(1)$ query time solution exists. Recently an efficient algorithm for this problem has been given by F. Blanchet-Sadri and J. Lazarow (LATA 2013). The preprocessing time was $O(nh+n)$, where $h$ is the number of holes in $w$. The algorithm was designed for partial words over a constant alphabet and was quite involved. We present a simple solution to this problem with slightly better runtime that works for any linearly-sortable alphabet. Our preprocessing is in time $O(n\mu+n)$, where $\mu$ is the number of blocks of holes in $w$. Our algorithm uses ideas from alignment algorithms and dynamic programming.
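The definition of the longest common compatible prefix can be illustrated with a quadratic-time table in Python. This is a naive baseline, not the preprocessing described in the abstract. It assumes 0-based positions and uses `?` as a hypothetical hole symbol; the names `compatible` and `lccp_table` are also hypothetical. Since compatibility of two substrings is checked position by position, the value at `(i, j)` is one more than the value at `(i+1, j+1)` whenever the symbols at `i` and `j` are compatible, and 0 otherwise.

```python
HOLE = "?"  # assumed hole symbol for this sketch


def compatible(a, b):
    """Two symbols are compatible if they are equal or either one is a hole."""
    return a == b or a == HOLE or b == HOLE


def lccp_table(w):
    """table[i][j] is the longest common compatible prefix of positions i and j."""
    n = len(w)
    # One extra row and column of zeros stand for positions past the end.
    table = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if compatible(w[i], w[j]):
                table[i][j] = table[i + 1][j + 1] + 1
    return table
```

For `w = "ab?ab"`, the table gives `table[0][3] == 2` and `table[1][2] == 2`. The same helper shows why compatibility is not an equivalence relation: `a` is compatible with `?`, and `?` with `b`, but `a` is not compatible with `b`.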

Data Structures and Algorithms, Formal Languages and Automata Theory

Internal Pattern Matching Queries in a Text and Applications

76 - Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter 2013

We consider several types of internal queries: questions about subwords of a text. As the main tool we develop an optimal data structure for the problem called here internal pattern matching. This data structure provides constant-time answers to queries about occurrences of one subword $x$ in another subword $y$ of a given text, assuming that $|y|=\mathcal{O}(|x|)$, which allows for a constant-space representation of all occurrences. This problem can be viewed as a natural extension of the well-studied pattern matching problem. The data structure has linear size and admits a linear-time construction algorithm. Using the solution to the internal pattern matching problem, we obtain very efficient data structures answering queries about: primitivity of subwords, periods of subwords, general substring compression, and cyclic equivalence of two subwords. All these results improve upon the best previously known counterparts. The linear construction time of our data structure also allows us to improve the algorithm for finding $\delta$-subrepetitions in a text (a more general version of maximal repetitions, also called runs). For any fixed $\delta$ we obtain the first linear-time algorithm, which matches the linear time complexity of the algorithm computing runs. Our data structure has already been used as a part of the efficient solutions for subword suffix rank & selection, as well as substring compression using Burrows-Wheeler transform composed with run-length encoding.
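The constant-space representation mentioned above can be illustrated with a naive Python sketch. It scans the text directly and is not the constant-time data structure from the abstract. It relies on a standard periodicity fact: when $|y| \le 2|x|$, the starting positions of $x$ in $y$ form an arithmetic progression, so three integers describe them all. The function names and the `(first, count, difference)` encoding are assumptions made for illustration, as is the use of half-open index ranges into the text.

```python
def occurrences(x, y):
    """Starting positions (relative to y) of all occurrences of x in y."""
    return [i for i in range(len(y) - len(x) + 1) if y.startswith(x, i)]


def as_progression(positions):
    """Pack sorted positions into (first, count, difference), or None if empty."""
    if not positions:
        return None
    if len(positions) == 1:
        return (positions[0], 1, 0)
    diff = positions[1] - positions[0]
    if any(b - a != diff for a, b in zip(positions, positions[1:])):
        raise ValueError("positions do not form an arithmetic progression")
    return (positions[0], len(positions), diff)


def internal_pattern_matching(text, x_range, y_range):
    """Naive query: occurrences of text[x_range] inside text[y_range]."""
    x = text[x_range[0]:x_range[1]]
    y = text[y_range[0]:y_range[1]]
    if len(y) > 2 * len(x):
        raise ValueError("this sketch assumes |y| <= 2|x|")
    return as_progression(occurrences(x, y))
```

For the text `abababab`, the query `internal_pattern_matching("abababab", (0, 4), (0, 8))` returns `(0, 3, 2)`: the subword `abab` occurs in the whole text at positions 0, 2, and 4.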

Data Structures and Algorithms
