ﻻ يوجد ملخص باللغة العربية
In this paper, we consider the problem of identifying patterns of interest in colored strings. A colored string is a string where each position is assigned one of a finite set of colors. Our task is to find substrings of the colored string that always occur followed by the same color at the same distance. The problem is motivated by applications in embedded systems verification, in particular, assertion mining. The goal there is to automatically find properties of the embedded system from the analysis of its simulation traces. We show that, in our setting, the number of patterns of interest is upper-bounded by $mathcal{O}(n^2)$, where $n$ is the length of the string. We introduce a baseline algorithm, running in $mathcal{O}(n^2)$ time, which identifies all patterns of interest satisfying certain minimality conditions, for all colors in the string. For the case where one is interested in patterns related to one color only, we also provide a second algorithm which runs in $mathcal{O}(n^2log n)$ time in the worst case but is faster than the baseline algorithm in practice. Both solutions use suffix trees, and the second algorithm also uses an appropriately defined priority queue, which allows us to reduce the number of computations. We performed an experimental evaluation of the proposed approaches over both synthetic and real-world datasets, and found that the second algorithm outperforms the first algorithm on all simulated data, while on the real-world data, the performance varies between a slight slowdown (on half of the datasets) and a speedup by a factor of up to 11.
Given an indeterminate string pattern $p$ and an indeterminate string text $t$, the problem of order-preserving pattern matching with character uncertainties ($mu$OPPM) is to find all substrings of $t$ that satisfy one of the possible orderings defin
The Karp-Rabin fingerprint of a string is a type of hash value that due to its strong properties has been used in many string algorithms. In this paper we show how to construct a data structure for a string $S$ of size $N$ compressed by a context-fre
We investigate the computational complexity of the following problem. We are given a graph in which each vertex has an initial and a target color. Each pair of adjacent vertices can swap their current colors. Our goal is to perform the minimum number
We present a number of new results about range searching for colored (or categorical) data: 1. For a set of $n$ colored points in three dimensions, we describe randomized data structures with $O(nmathop{rm polylog}n)$ space that can report the dist
A string $S[1,n]$ is a power (or tandem repeat) of order $k$ and period $n/k$ if it can decomposed into $k$ consecutive equal-length blocks of letters. Powers and periods are fundamental to string processing, and algorithms for their efficient comput