Periodic words, common subsequences and frogs


Abstract in English

Let $W^{(n)}$ be the $n$-letter word obtained by repeating a fixed word $W$, and let $R_n$ be a random $n$-letter word over the same alphabet. We show several results about the length of the longest common subsequence (LCS) between $W^{(n)}$ and $R_n$; in particular, we show that its expectation is $\gamma_W n - O(\sqrt{n})$ for an efficiently computable constant $\gamma_W$. This is done by relating the problem to a new interacting particle system, which we dub frog dynamics. In this system, the particles ("frogs") hop over one another in the order given by their labels. Stripped of the labeling, the frog dynamics reduces to a variant of the PushTASEP. In the special case when all symbols of $W$ are distinct, we obtain an explicit formula for the constant $\gamma_W$ and a closed-form expression for the stationary distribution of the associated frog dynamics. In addition, we propose new conjectures about the asymptotics of the LCS of a pair of random words. These conjectures are informed by computer experiments using a new heuristic algorithm to compute the LCS. Through our computations, we found periodic words that are more random-like than a random word, as measured by the LCS.
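To make the quantities in the abstract concrete, the following minimal sketch builds $W^{(n)}$, draws $R_n$, and estimates the ratio $\mathrm{LCS}(W^{(n)}, R_n)/n$ by simulation. It uses the standard quadratic-time dynamic program for the LCS, not the heuristic algorithm or the frog dynamics described above. The function names are hypothetical, and two points are assumptions not stated in the abstract: that $R_n$ is drawn uniformly at random, and that the alphabet is exactly the set of letters occurring in $W$.

```python
import random


def periodic_word(w, n):
    """Return the first n letters of the infinite repetition of w."""
    return (w * (n // len(w) + 1))[:n]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Standard dynamic program, O(len(a) * len(b)) time, keeping one row.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def estimate_ratio(w, n, trials=20, seed=0):
    """Monte Carlo estimate of E[LCS(W^(n), R_n)] / n.

    Assumption: R_n is uniform over the letters that occur in w.
    """
    rng = random.Random(seed)
    alphabet = sorted(set(w))
    periodic = periodic_word(w, n)
    total = 0
    for _ in range(trials):
        r = "".join(rng.choice(alphabet) for _ in range(n))
        total += lcs_length(periodic, r)
    return total / (trials * n)


if __name__ == "__main__":
    # For large n the printed value should approach gamma_W from below,
    # in line with the n * gamma_W - O(sqrt(n)) statement.
    print(estimate_ratio("ab", n=500))
```

Because of the $O(\sqrt{n})$ correction term, estimates at moderate $n$ would be expected to sit noticeably below $\gamma_W$; the sketch is meant only to illustrate the definitions, not to reproduce the paper's computations.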
