No Arabic abstract
Random divisions of an interval arise in various contexts, including statistics, physics, and geometric analysis. For testing the uniformity of a random partition of the unit interval $[0,1]$ into $k$ disjoint subintervals of sizes $(S_k[1],\ldots,S_k[k])$, Greenwood (1946) suggested using the squared $\ell_2$-norm of this size vector as a test statistic, prompting a number of subsequent studies. Despite much progress on understanding its power and asymptotic properties, attempts to find its exact distribution have so far succeeded only for small values of $k$. Here, we develop an efficient method to compute the distribution of the Greenwood statistic and more general spacing statistics for an arbitrary value of $k$. Specifically, we consider random divisions of $\{1,2,\dots,n\}$ into $k$ subsets of consecutive integers and study $\|S_{n,k}\|^p_{p,w}$, the $p$th power of the weighted $\ell_p$-norm of the subset size vector $S_{n,k}=(S_{n,k}[1],\ldots,S_{n,k}[k])$ for arbitrary weights $w=(w_1,\ldots,w_k)$. We present an exact and quickly computable formula for its moments, as well as a simple algorithm to accurately reconstruct a probability distribution from the moment sequence. We also study various scaling limits, one of which corresponds to the Greenwood statistic in the case of $p=2$ and $w=(1,\ldots,1)$, and this connection allows us to obtain information about the regularity, monotonicity, and local behavior of its distribution. Lastly, we devise a new family of non-parametric tests using $\|S_{n,k}\|^p_{p,w}$ and demonstrate that they exhibit substantially improved power for a large class of alternatives, compared to existing popular methods such as the Kolmogorov-Smirnov, Cramér-von Mises, and Mann-Whitney/Wilcoxon rank-sum tests.
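To make the statistic in this abstract concrete, the following is a minimal sketch that evaluates $\sum_i w_i S[i]^p$ for a given size vector and draws one random division of $\{1,\dots,n\}$ into $k$ blocks of consecutive integers. The function names are hypothetical, and the sampling scheme (blocks are nonempty, with the $k-1$ cut positions uniform among the $n-1$ gaps) is an assumption made for illustration; the abstract does not specify the model, and the sketch does not reproduce the paper's moment formula or reconstruction algorithm.

```python
import numpy as np


def random_composition(n, k, rng):
    """Sizes of k blocks of consecutive integers that partition {1, ..., n}.

    Assumption (not from the abstract): blocks are nonempty and the k-1 cut
    positions are drawn uniformly without replacement from the n-1 gaps.
    """
    cuts = np.sort(rng.choice(np.arange(1, n), size=k - 1, replace=False))
    edges = np.concatenate(([0], cuts, [n]))
    return np.diff(edges)


def weighted_lp_power(sizes, p=2, w=None):
    """p-th power of the weighted l_p norm: sum_i w_i * sizes[i]**p."""
    sizes = np.asarray(sizes, dtype=float)
    w = np.ones_like(sizes) if w is None else np.asarray(w, dtype=float)
    return float(np.sum(w * sizes ** p))


def greenwood(spacings):
    """Greenwood statistic: squared l_2 norm of spacings of [0, 1]."""
    return weighted_lp_power(spacings, p=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sizes = random_composition(n=100, k=5, rng=rng)
    print(sizes, weighted_lp_power(sizes, p=2))

    # Continuous analogue: spacings from k-1 uniform points on [0, 1].
    points = np.sort(rng.uniform(size=4))
    spacings = np.diff(np.concatenate(([0.0], points, [1.0])))
    print(greenwood(spacings))
```

With $p=2$ and unit weights the statistic is smallest when all blocks have equal size, which is why large values are read as evidence against uniformity.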
A reflexive generalized inverse and the Moore-Penrose inverse are often confused in the statistical literature, but in fact they behave completely differently when the population covariance matrix is not a multiple of the identity. In this paper, we study the spectral properties of a reflexive generalized inverse and of the Moore-Penrose inverse of the sample covariance matrix. The obtained results are used to assess the difference in the asymptotic behaviour of their eigenvalues.
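The distinction between the two inverses can be checked numerically. The sketch below builds a singular sample covariance matrix $S$ (more variables than observations), computes its Moore-Penrose inverse with `np.linalg.pinv`, and constructs a reflexive generalized inverse by a generic textbook route: $G_1 = S^+ + (I - S^+S)W$ is a generalized inverse for any $W$, and $G = G_1 S G_1$ is then reflexive. This construction, the dimensions, and the random matrix `W` are assumptions chosen for illustration; it is not necessarily the specific reflexive inverse studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 6, 4                      # more variables than observations
X = rng.standard_normal((p, n))  # assumption: data already centered
S = X @ X.T / n                  # singular p x p sample covariance, rank n

# Moore-Penrose inverse; an explicit cutoff guards against inverting
# numerically zero eigenvalues.
S_mp = np.linalg.pinv(S, rcond=1e-10, hermitian=True)

# Generic reflexive generalized inverse (illustrative construction).
W = rng.standard_normal((p, p))
G1 = S_mp + (np.eye(p) - S_mp @ S) @ W   # satisfies S G1 S = S
G = G1 @ S @ G1                          # additionally satisfies G S G = G


def penrose_conditions(A, B, tol=1e-8):
    """Check the four Penrose conditions for a candidate inverse B of A."""
    return {
        "ABA = A": np.allclose(A @ B @ A, A, atol=tol),
        "BAB = B": np.allclose(B @ A @ B, B, atol=tol),
        "AB symmetric": np.allclose(A @ B, (A @ B).T, atol=tol),
        "BA symmetric": np.allclose(B @ A, (B @ A).T, atol=tol),
    }


mp_checks = penrose_conditions(S, S_mp)
g_checks = penrose_conditions(S, G)
print(mp_checks)
print(g_checks)
```

A reflexive generalized inverse only has to satisfy the first two conditions, while the Moore-Penrose inverse is the unique matrix satisfying all four, so the two generally have different spectra.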
We present a machine learning model for the analysis of randomly generated discrete signals, which we model as the points of a homogeneous or inhomogeneous, compound Poisson point process. Like the wavelet scattering transform introduced by S. Mallat, our construction is a mathematical model of convolutional neural networks and is naturally invariant to translations and reflections. Our model replaces wavelets with Gabor-type measurements and therefore decouples the roles of scale and frequency. We show that, with suitably chosen nonlinearities, our measurements distinguish Poisson point processes from common self-similar processes, and separate different types of Poisson point processes based on the first and second moments of the arrival intensity $\lambda(t)$, as well as the absolute moments of the charges associated to each point.
This paper has been temporarily withdrawn, pending a revised version taking into account similarities between this paper and the recent work of del Barrio, Gine and Utzet (Bernoulli, 11 (1), 2005, 131-189).
We deal with a planar random flight $\{(X(t),Y(t)),0<t\leq T\}$ observed at $n+1$ equidistant times $t_i=i\Delta_n$, $i=0,1,\ldots,n$. The aim of this paper is to estimate the unknown value of the parameter $\lambda$, the rate of the underlying Poisson process. Planar random flights are not Markovian, so we use an alternative argument to derive a pseudo-maximum likelihood estimator $\hat{\lambda}$ of the parameter $\lambda$. We consider two different types of asymptotic schemes and show the consistency, asymptotic normality, and efficiency of the proposed estimator. A Monte Carlo analysis for small sample sizes $n$ allows us to assess the empirical performance of $\hat{\lambda}$. A different approach allows us to introduce an alternative estimator of $\lambda$ which is consistent, asymptotically normal, and asymptotically efficient without requiring further assumptions.
We establish exponential inequalities for a class of V-statistics under strong mixing conditions. Our theory is developed via a novel kernel expansion based on random Fourier features and the use of a probabilistic method. This type of expansion is new and useful for handling many notorious classes of kernels.
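As background for the kind of expansion mentioned here, the sketch below shows the standard random Fourier feature approximation of a Gaussian kernel (the Rahimi-Recht construction) and how it turns a second-order V-statistic $\frac{1}{n^2}\sum_{i,j} k(X_i,X_j)$ into the squared norm of a feature average. This is offered as an illustration of the general idea only; the paper's own expansion may differ, and the function names, the choice of Gaussian kernel, and the bandwidth `sigma` are assumptions.

```python
import numpy as np


def draw_rff(d, D, sigma, rng):
    """Draw D random frequencies and phases for a Gaussian kernel on R^d."""
    W = rng.standard_normal((d, D)) / sigma      # frequencies ~ N(0, I / sigma^2)
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)    # phases ~ Uniform[0, 2*pi]
    return W, b


def rff_features(X, W, b):
    """Feature map z(x) with E[z(x) . z(y)] = exp(-|x - y|^2 / (2 sigma^2))."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)


def gaussian_kernel(X, sigma):
    """Exact Gaussian kernel matrix for the rows of X."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def v_statistic_rff(Z):
    """(1/n^2) * sum_{i,j} z_i . z_j, computed as |mean_i z_i|^2."""
    mean_feature = Z.mean(axis=0)
    return float(mean_feature @ mean_feature)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 2))
    W, b = draw_rff(d=2, D=5000, sigma=1.0, rng=rng)
    Z = rff_features(X, W, b)
    print(gaussian_kernel(X, 1.0).mean(), v_statistic_rff(Z))
```

The point of such an expansion is that the double sum over pairs factorizes into a function of a single sample average, which is typically easier to control under dependence.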