Optimal Dynamic Strings

published by Tomasz Kociumaka in 2015 in Informatics Engineering and research's language is English Download

Abstract in English

In this paper we study the fundamental problem of maintaining a dynamic collection of strings under the following operations: concat - concatenates two strings, split - splits a string into two at a given position, compare - finds the lexicographical order (less, equal, greater) between two strings, LCP - calculates the longest common prefix of two strings. We present an efficient data structure for this problem, where an update requires only $O(log n)$ worst-case time with high probability, with $n$ being the total length of all strings in the collection, and a query takes constant worst-case time. On the lower bound side, we prove that even if the only possible query is checking equality of two strings, either updates or queries take amortized $Omega(log n)$ time; hence our implementation is optimal. Such operations can be used as a basic building block to solve other string problems. We provide two examples. First, we can augment our data structure to provide pattern matching queries that may locate occurrences of a specified pattern $p$ in the strings in our collection in optimal $O(|p|)$ time, at the expense of increasing update time to $O(log^2 n)$. Second, we show how to maintain a history of an edited text, processing updates in $O(log t log log t)$ time, where $t$ is the number of edits, and how to support pattern matching queries against the whole history in $O(|p| log t log log t)$ time. Finally, we note that our data structure can be applied to test dynamic tree isomorphism and to compare strings generated by dynamic straight-line grammars.

Download