Given a random text over a finite alphabet, we study the frequencies at which fixed-length words occur as subsequences. As the data size grows, the joint distribution of word counts exhibits a rich asymptotic structure. We investigate all linear combinations of subword statistics and fully characterize their different orders of magnitude using diverse algebraic tools. Moreover, we establish the spectral decomposition of the space of word statistics of each order, and we provide explicit formulas for the eigenvectors and eigenvalues of the covariance matrix of the multivariate distribution of these statistics. Our techniques include and elaborate on a set of algebraic word operators recently studied and employed by Dieker and Saliola (Adv Math, 2018). Subword counts find applications in combinatorics, statistics, and computer science. We revisit special cases from the combinatorial literature, such as intransitive dice, random core partitions, and questions on random walks. Our structural approach describes several classical statistical tests in a unified framework. We propose further potential applications to data analysis and machine learning.
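To make the basic statistic concrete, the following is a minimal sketch of counting how often a fixed word occurs as a (not necessarily contiguous) subsequence of a text. The function name `count_subsequence_occurrences` and the dynamic-programming approach are illustrative assumptions and are not taken from the paper.

```python
def count_subsequence_occurrences(text, word):
    """Count occurrences of `word` as a subsequence of `text`.

    Hypothetical helper for illustration; not the paper's method.
    """
    # counts[j] = number of ways to match word[:j] as a subsequence
    # of the portion of the text read so far.
    counts = [1] + [0] * len(word)
    for ch in text:
        # Go right to left so that each text character extends only
        # matches that ended strictly before it.
        for j in range(len(word), 0, -1):
            if word[j - 1] == ch:
                counts[j] += counts[j - 1]
    return counts[len(word)]


if __name__ == "__main__":
    # "ab" occurs in "abab" at index pairs (0,1), (0,3), (2,3).
    print(count_subsequence_occurrences("abab", "ab"))
```

Evaluating this count for every word of a fixed length over the alphabet gives the vector of subword statistics whose joint distribution the abstract describes.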
We obtain analytical approximations for the expectation and variance of the Spectral Kurtosis estimator in the case of Gaussian and coherent transient time domain signals mixed with a quasi-stationary Gaussian background, which are suitable for pract
We study the spectrum of a random multigraph with a degree sequence $\mathbf{D}_n=(D_i)_{i=1}^n$ and average degree $1 \ll \omega_n \ll n$, generated by the configuration model, and also the spectrum of the analogous random simple graph. We show that, when
RNA motifs typically consist of short, modular patterns that include base pairs formed within and between modules. Estimating the abundance of these patterns is of fundamental importance for assessing the statistical significance of matches in genome
We prove a non-asymptotic concentration inequality for the spectral norm of sparse inhomogeneous random tensors with Bernoulli entries. For an order-$k$ inhomogeneous random tensor $T$ with sparsity $p_{\max} \geq \frac{c \log n}{n}$, we show that $|T-mat
The hardcore model on a graph $G$ with parameter $\lambda>0$ is a probability measure on the collection of all independent sets of $G$ that assigns to each independent set $I$ a probability proportional to $\lambda^{|I|}$. In this paper we consider th