Staircase patterns in words: subsequences, subwords, and separation number


Added by Reza Rastegar

Publication date 2019

Field Combinatorics

Language English

Authors Toufik Mansour, Reza Rastegar, Alexander Roitershtein


We revisit staircases for words and prove several exact as well as asymptotic results for longest left-most staircase subsequences and subwords and for the staircase separation number, the latter being defined as the number of consecutive maximal staircase subwords packed in a word. We study asymptotic properties of the sequence $h_{r,k}(n)$, the number of $n$-array words with $r$ separations over the alphabet $[k]$, and show that for any $r\geq 0$, the growth sequence $\big(h_{r,k}(n)\big)^{1/n}$ converges to a characterized limit, independent of $r$. In addition, we study the asymptotic behavior of the random variable $\mathcal{S}_k(n)$, the number of staircase separations in a random word in $[k]^n$, and obtain several limit theorems for the distribution of $\mathcal{S}_k(n)$, including a law of large numbers, a central limit theorem, and the exact growth rate of the entropy of $\mathcal{S}_k(n)$. Finally, we obtain similar results, including growth limits, for longest $L$-staircase subwords and subsequences.


Periodic words, common subsequences and frogs

Boris Bukh, Christopher Cox 2019

Let $W^{(n)}$ be the $n$-letter word obtained by repeating a fixed word $W$, and let $R_n$ be a random $n$-letter word over the same alphabet. We show several results about the length of the longest common subsequence (LCS) between $W^{(n)}$ and $R_n$; in particular, we show that its expectation is $\gamma_W n-O(\sqrt{n})$ for an efficiently computable constant $\gamma_W$. This is done by relating the problem to a new interacting particle system, which we dub frog dynamics. In this system, the particles ("frogs") hop over one another in the order given by their labels. Stripped of the labeling, the frog dynamics reduces to a variant of the PushTASEP. In the special case when all symbols of $W$ are distinct, we obtain an explicit formula for the constant $\gamma_W$ and a closed-form expression for the stationary distribution of the associated frog dynamics. In addition, we propose new conjectures about the asymptotics of the LCS of a pair of random words. These conjectures are informed by computer experiments using a new heuristic algorithm to compute the LCS. Through our computations, we found periodic words that are more random-like than a random word, as measured by the LCS.
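To make the quantity in this abstract concrete, the following minimal sketch builds the periodic word $W^{(n)}$, draws uniform random words, and estimates the ratio of LCS length to $n$ by simulation. It uses the textbook quadratic dynamic program for the LCS, not the frog dynamics or the heuristic algorithm of the paper, and the function names, the uniform sampling over the letters occurring in $W$, and the default parameters are all assumptions made for illustration.

```python
import random


def periodic_word(w, n):
    """Return the n-letter word obtained by repeating w (hypothetical helper)."""
    reps = n // len(w) + 1
    return (w * reps)[:n]


def lcs_length(a, b):
    """Length of the longest common subsequence via the standard O(|a||b|) DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def estimate_lcs_ratio(w, n, trials=20, seed=0):
    """Monte Carlo estimate of E[LCS(W^(n), R_n)] / n.

    Assumption: R_n is uniform over the letters that occur in w.
    """
    rng = random.Random(seed)
    alphabet = sorted(set(w))
    periodic = periodic_word(w, n)
    total = 0
    for _ in range(trials):
        r = "".join(rng.choice(alphabet) for _ in range(n))
        total += lcs_length(periodic, r)
    return total / (trials * n)
```

For example, `estimate_lcs_ratio("ab", 200)` gives a rough finite-$n$ estimate of the constant for $W = ab$; by the $O(\sqrt{n})$ correction stated above, such estimates are expected to sit slightly below the limiting value.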

Probability Combinatorics

Expected Number of Distinct Subsequences in Randomly Generated Binary Strings

Yonah Biers-Ariel, Anant Godbole, Elizabeth Kelley 2017

When considering binary strings, it is natural to wonder how many distinct subsequences might exist in a given string. Given that an existing algorithm provides a straightforward way to compute the number of distinct subsequences in a fixed string, we might next be interested in the expected number of distinct subsequences in random strings. This expected value is already known for random binary strings where each letter in the string is, independently, equally likely to be a 1 or a 0. We generalize this result to random strings where the letter 1 appears independently with probability $\alpha \in [0,1]$. Also, we make some progress in the case of random strings from an arbitrary alphabet as well as when the string is generated by a two-state Markov chain.
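As an illustration of the counting step mentioned above, here is a minimal sketch of a standard dynamic program for the number of distinct subsequences of a fixed string, followed by a brute-force expectation over all binary strings of length $n$ with bias $\alpha$. This is not necessarily the algorithm the authors refer to; the function names are hypothetical, and the count here includes the empty subsequence, which is a convention chosen for this sketch.

```python
from itertools import product


def count_distinct_subsequences(s):
    """Number of distinct subsequences of s, counting the empty one."""
    total = 1            # only the empty subsequence so far
    before_last = {}     # count just before the previous occurrence of each letter
    for ch in s:
        new_total = 2 * total - before_last.get(ch, 0)
        before_last[ch] = total
        total = new_total
    return total


def expected_distinct_subsequences(n, alpha):
    """Exact expectation by enumerating all 2^n binary strings.

    Each letter is 1 with probability alpha, independently.
    Exponential in n, so only suitable for small n.
    """
    expectation = 0.0
    for bits in product("01", repeat=n):
        ones = bits.count("1")
        prob = alpha ** ones * (1 - alpha) ** (n - ones)
        expectation += prob * count_distinct_subsequences("".join(bits))
    return expectation
```

For instance, `count_distinct_subsequences("aba")` returns 7, corresponding to the empty word, `a`, `b`, `aa`, `ab`, `ba`, and `aba`.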

Combinatorics

The Number of Distinct Subsequences of a Random Binary String

Michael J. Collins 2013

We determine the average number of distinct subsequences in a random binary string, and derive an estimate for the average number of distinct subsequences of a particular length.

Combinatorics

On the number of words with restrictions on the number of symbols

Veronica Becher, Eda Cesaratto 2021

We show that, in an alphabet of $n$ symbols, the number of words of length $n$ whose number of different symbols is away from $(1-1/e)n$, which is the value expected by the Poisson distribution, has exponential decay in $n$. We use Laplace's method for sums and known bounds on Stirling numbers of the second kind. We express our result in terms of inequalities.
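The reference value $(1-1/e)n$ can be checked numerically. For a uniformly random word of length $n$ over $n$ symbols, each symbol is absent with probability $(1-1/n)^n$, so by linearity of expectation the mean number of distinct symbols is $n\big(1-(1-1/n)^n\big)$, which approaches $(1-1/e)n$. The sketch below compares this formula with a brute-force average for small $n$; the function names are hypothetical, and the code illustrates only the expected value, not the exponential decay bound of the paper.

```python
from fractions import Fraction
from itertools import product
from math import e


def mean_distinct_exact(n):
    """n * (1 - (1 - 1/n)^n), computed exactly with rationals."""
    return n * (1 - (1 - Fraction(1, n)) ** n)


def mean_distinct_bruteforce(n):
    """Average number of distinct symbols over all n^n words (small n only)."""
    total = sum(len(set(word)) for word in product(range(n), repeat=n))
    return Fraction(total, n ** n)


def poisson_reference(n):
    """The limiting value (1 - 1/e) * n quoted in the abstract."""
    return (1 - 1 / e) * n
```

For $n = 3$ both functions give $19/9 \approx 2.11$, while `poisson_reference(3)` is about $1.90$; the relative gap shrinks as $n$ grows.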

Combinatorics

Longest common subsequences between words of very unequal length

Boris Bukh, Zichao Dong 2020

We consider the expected length of the longest common subsequence between two random words of lengths $n$ and $(1-\varepsilon)kn$ over a $k$-symbol alphabet. It is well known that this quantity is asymptotic to $\gamma_{k,\varepsilon} n$ for some constant $\gamma_{k,\varepsilon}$. We show that $\gamma_{k,\varepsilon}$ is of the order $1-c\varepsilon^2$ uniformly in $k$ and $\varepsilon$. In addition, for large $k$, we give evidence that $\gamma_{k,\varepsilon}$ approaches $1-\tfrac{1}{4}\varepsilon^2$, and prove a matching lower bound.

Probability Combinatorics