Generalized Spacing-Statistics and a New Family of Non-Parametric Tests


Abstract in English

Random divisions of an interval arise in various contexts, including statistics, physics, and geometric analysis. For testing the uniformity of a random partition of the unit interval $[0,1]$ into $k$ disjoint subintervals of sizes $(S_k[1],\ldots,S_k[k])$, Greenwood (1946) suggested using the squared $\ell_2$-norm of this size vector as a test statistic, prompting a number of subsequent studies. Despite much progress on understanding its power and asymptotic properties, attempts to find its exact distribution have so far succeeded only for small values of $k$. Here, we develop an efficient method to compute the distribution of the Greenwood statistic and more general spacing-statistics for an arbitrary value of $k$. Specifically, we consider random divisions of $\{1,2,\dots,n\}$ into $k$ subsets of consecutive integers and study $\|S_{n,k}\|^p_{p,w}$, the $p$th power of the weighted $\ell_p$-norm of the subset size vector $S_{n,k}=(S_{n,k}[1],\ldots,S_{n,k}[k])$ for arbitrary weights $w=(w_1,\ldots,w_k)$. We present an exact and quickly computable formula for its moments, as well as a simple algorithm to accurately reconstruct a probability distribution from the moment sequence. We also study various scaling limits, one of which corresponds to the Greenwood statistic in the case of $p=2$ and $w=(1,\ldots,1)$, and this connection allows us to obtain information about the regularity, monotonicity, and local behavior of its distribution. Lastly, we devise a new family of non-parametric tests using $\|S_{n,k}\|^p_{p,w}$ and demonstrate that they exhibit substantially improved power for a large class of alternatives, compared to existing popular methods such as the Kolmogorov-Smirnov, Cramér-von Mises, and Mann-Whitney/Wilcoxon rank-sum tests.
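As a rough illustration of the quantities named in the abstract, the following Python sketch evaluates a weighted $\ell_p$ statistic and simulates the Greenwood statistic on $[0,1]$. It is not the exact method developed in this work. It assumes the common convention that the $p$th power of the weighted norm is $\sum_i w_i s_i^p$, and that the uniform partition is generated by $k-1$ independent uniform cut points. The function names `weighted_lp_power`, `uniform_spacings`, and `greenwood` are hypothetical.

```python
import random


def weighted_lp_power(sizes, p=2, weights=None):
    """Return sum_i w_i * s_i**p (assumed convention for the weighted l_p norm to the p-th power)."""
    if weights is None:
        weights = [1.0] * len(sizes)  # unweighted case w = (1, ..., 1)
    if len(weights) != len(sizes):
        raise ValueError("weights and sizes must have the same length")
    return sum(w * s ** p for w, s in zip(weights, sizes))


def uniform_spacings(k, rng):
    """Split [0, 1] into k spacings using k - 1 independent uniform cut points."""
    cuts = sorted(rng.random() for _ in range(k - 1))
    edges = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]


def greenwood(k, rng):
    """Greenwood statistic: squared l_2 norm of the spacing vector (p = 2, unit weights)."""
    return weighted_lp_power(uniform_spacings(k, rng), p=2)


# Monte Carlo check against the known mean 2 / (k + 1) under uniformity.
rng = random.Random(0)
k = 5
draws = [greenwood(k, rng) for _ in range(20000)]
mc_mean = sum(draws) / len(draws)
print(mc_mean, 2 / (k + 1))
```

Under uniformity each spacing follows a Beta$(1, k-1)$ distribution, so the mean of the Greenwood statistic is $2/(k+1)$, and the simulated average should land close to that value. Simulation of this kind only approximates the distribution, which is the limitation that the exact moment formula in this work is meant to address.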
