Statistical methods for functional data are of interest for many applications. In this paper, we prove a central limit theorem for random variables taking their values in a Hilbert space. The random variables are assumed to be weakly dependent in the sense of near epoch dependence, where the underlying process fulfills some mixing conditions. As parametric inference in an infinite-dimensional space is difficult, we show that the nonoverlapping block bootstrap is consistent. Furthermore, we show how these results can be used for degenerate von Mises statistics.
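As an illustration of the nonoverlapping block bootstrap mentioned above, the following is a minimal sketch, not the paper's procedure. It assumes the Hilbert-space observations have been discretized to vectors in R^d, and the function name, the block length, and the AR(1) toy data are all hypothetical choices made for the example.

```python
import numpy as np

def nonoverlapping_block_bootstrap(X, block_len, n_boot, rng):
    """Bootstrap the centered, scaled sample mean of a dependent sequence.

    X is an (n, d) array whose rows are discretized curves in time order
    (an assumption of this sketch). The series is cut into consecutive
    nonoverlapping blocks, and whole blocks are resampled with replacement.
    """
    n, d = X.shape
    k = n // block_len                          # number of complete blocks
    blocks = X[: k * block_len].reshape(k, block_len, d)
    center = blocks.reshape(-1, d).mean(axis=0)  # mean of the retained data
    stats = np.empty((n_boot, d))
    for b in range(n_boot):
        idx = rng.integers(0, k, size=k)        # draw k block indices with replacement
        sample = blocks[idx].reshape(-1, d)
        stats[b] = np.sqrt(k * block_len) * (sample.mean(axis=0) - center)
    return stats

# Toy data: a vector AR(1) process standing in for a functional time series.
rng = np.random.default_rng(0)
n, d = 120, 5
X = np.zeros((n, d))
for t in range(1, n):
    X[t] = 0.5 * X[t - 1] + rng.normal(size=d)

boot = nonoverlapping_block_bootstrap(X, block_len=10, n_boot=200, rng=rng)
# Bootstrap quantile of the sup-norm of the scaled mean deviation.
q95 = np.quantile(np.abs(boot).max(axis=1), 0.95)
```

Resampling whole blocks keeps the dependence within each block intact, which is why the block length has to grow with the sample size in consistency results of this kind.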
We establish exponential inequalities for a class of V-statistics under strong mixing conditions. Our theory is developed via a novel kernel expansion based on random Fourier features and the use of a probabilistic method. This type of expansion is new and useful for handling many notoriously difficult classes of kernels.
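To make the idea of a random-Fourier-feature expansion concrete, here is a minimal sketch for the Gaussian kernel using the standard random cosine construction. It is not the expansion developed in the paper; the kernel, the bandwidth `sigma`, the number of features, and all function names are assumptions chosen for illustration. The sketch shows how a degree-2 V-statistic collapses to a squared norm of an averaged feature vector.

```python
import numpy as np

def rff_features(X, n_features, sigma, rng):
    """Random cosine features z(x) with E[z(x) . z(y)] = exp(-|x-y|^2 / (2 sigma^2))."""
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))  # frequencies ~ N(0, I / sigma^2)
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)        # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def gaussian_kernel(X, sigma):
    """Exact Gaussian kernel matrix, for comparison."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Z = rff_features(X, n_features=5000, sigma=1.0, rng=rng)

# V-statistic (1/n^2) * sum_{i,j} k(X_i, X_j), computed exactly ...
v_exact = gaussian_kernel(X, sigma=1.0).mean()
# ... and via the expansion, where it becomes |mean_i z(X_i)|^2.
v_rff = np.sum(Z.mean(axis=0) ** 2)
```

Once the kernel is written as an inner product of features, the double sum factorizes into a function of a single sample average, which is the kind of structure that makes concentration arguments tractable.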
The infinite-dimensional Hilbert sphere $S^\infty$ has been widely employed to model density functions and shapes, extending the finite-dimensional counterpart. We consider the Fréchet mean as an intrinsic summary of the central tendency of data lying on $S^\infty$. To pave the way for sound statistical inference, we derive properties of the Fréchet mean on $S^\infty$ by establishing its existence and uniqueness as well as a root-$n$ central limit theorem (CLT) for the sample version, overcoming obstructions from infinite-dimensionality and lack of compactness on $S^\infty$. Intrinsic CLTs for the estimated tangent vectors and covariance operator are also obtained. Asymptotic and bootstrap hypothesis tests for the Fréchet mean based on projection and norm are then proposed and are shown to be consistent. The proposed two-sample tests are applied to draw inferences about daily taxi demand patterns over Manhattan modeled as densities, the square roots of which are analyzed on the Hilbert sphere. Numerical properties of the proposed hypothesis tests, which utilize the spherical geometry, are studied in the real data application and in simulations, where we demonstrate that the tests based on the intrinsic geometry compare favorably to those based on an extrinsic or flat geometry.
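The square-root representation and the sample Fréchet mean described above can be sketched numerically as follows. This is a rough illustration under stated assumptions, not the authors' implementation: densities are discretized on a uniform grid over [0, 1], the L2 inner product is approximated by a Riemann sum, the mean is found by a plain fixed-point iteration with logarithm and exponential maps, and the Gaussian-bump toy densities and all function names are hypothetical.

```python
import numpy as np

def inner(f, g, dx):
    """Riemann-sum approximation of the L2 inner product on the grid."""
    return np.sum(f * g) * dx

def log_map(p, q, dx):
    """Tangent vector at p pointing to q, with length equal to the geodesic distance."""
    c = np.clip(inner(p, q, dx), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    u = q - c * p                       # component of q orthogonal to p
    return theta * u / np.sqrt(inner(u, u, dx))

def exp_map(p, v, dx):
    """Point reached from p by following the geodesic with initial velocity v."""
    nv = np.sqrt(inner(v, v, dx))
    if nv < 1e-12:
        return p
    return np.cos(nv) * p + np.sin(nv) * v / nv

def frechet_mean(points, dx, n_iter=100, tol=1e-10):
    """Fixed-point iteration: move along the average log map until it vanishes."""
    mu = points.mean(axis=0)
    mu = mu / np.sqrt(inner(mu, mu, dx))  # project the extrinsic mean onto the sphere
    for _ in range(n_iter):
        grad = np.mean([log_map(mu, q, dx) for q in points], axis=0)
        if np.sqrt(inner(grad, grad, dx)) < tol:
            break
        mu = exp_map(mu, grad, dx)
    return mu

# Toy densities: Gaussian bumps on [0, 1], normalized numerically on the grid.
grid = np.linspace(0.0, 1.0, 501)
dx = grid[1] - grid[0]
centers = np.linspace(0.4, 0.6, 9)
dens = np.exp(-0.5 * ((grid[None, :] - centers[:, None]) / 0.1) ** 2)
dens /= dens.sum(axis=1, keepdims=True) * dx
roots = np.sqrt(dens)                   # square-root densities have unit L2 norm

mu = frechet_mean(roots, dx)
residual = np.mean([log_map(mu, q, dx) for q in roots], axis=0)
mean_density = mu ** 2                  # map the mean back to a density
```

At the Fréchet mean the average of the log-mapped data is zero, so the norm of `residual` serves as a simple convergence check; squaring `mu` returns a function that integrates to one on the grid.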
Few methods in Bayesian non-parametric statistics and machine learning have received as much attention as Bayesian Additive Regression Trees (BART). While BART is now routinely used for prediction tasks, its theoretical properties began to be understood only very recently. In this work, we continue the theoretical investigation of BART initiated by Rockova and van der Pas (2017). In particular, we study the Bernstein-von Mises (BvM) phenomenon (i.e., asymptotic normality) for smooth linear functionals of the regression surface within the framework of non-parametric regression with fixed covariates. As with other adaptive priors, the BvM phenomenon may fail when the regularities of the functional and the truth are not compatible. To overcome the curse of adaptivity under hierarchical priors, we impose a self-similarity assumption to ensure convergence towards a single Gaussian distribution as opposed to a Gaussian mixture. Similar qualitative restrictions on the functional parameter are known to be necessary for adaptive inference. Many machine learning methods lack coherent probabilistic mechanisms for gauging uncertainty. BART readily provides such quantification via posterior credible sets. The BvM theorem implies that the credible sets are also confidence regions with the same asymptotic coverage. This paper presents the first asymptotic normality result for BART priors, providing another piece of evidence that BART is a valid tool from a frequentist point of view.
This paper has been temporarily withdrawn, pending a revised version taking into account similarities between this paper and the recent work of del Barrio, Gine and Utzet (Bernoulli, 11 (1), 2005, 131-189).
In this paper, we study the asymptotic posterior distribution of linear functionals of the density. In particular, we give general conditions to obtain a semiparametric version of the Bernstein-von Mises theorem. We then apply this general result to nonparametric priors based on infinite-dimensional exponential families. As a byproduct, we also derive adaptive nonparametric rates of concentration of the posterior distributions under these families of priors over classes of Sobolev and Besov spaces.