Many real-life applications involve estimation of curves that exhibit complicated shapes including jumps or varying-frequency oscillations. Practical methods have been devised that can adapt to a locally varying complexity of an unknown function (e.g. variable-knot splines, sparse wavelet reconstructions, kernel methods or trees/forests). However, the overwhelming majority of existing asymptotic minimaxity theory is predicated on homogeneous smoothness assumptions. Focusing on locally Hölderian functions, we provide new locally adaptive posterior concentration rate results under the supremum loss for widely used Bayesian machine learning techniques in white noise and non-parametric regression. In particular, we show that popular spike-and-slab priors and Bayesian CART are uniformly locally adaptive. In addition, we propose a new class of repulsive partitioning priors which relate to variable-knot splines and which are exact-rate adaptive. For uncertainty quantification, we construct locally adaptive confidence bands whose width depends on the local smoothness and which achieve uniform asymptotic coverage under local self-similarity. To illustrate that spatial adaptation is not at all automatic, we provide lower-bound results showing that popular hierarchical Gaussian process priors fall short of spatial adaptation.
Bayesian methods are developed for the multivariate nonparametric regression problem where the domain is taken to be a compact Riemannian manifold. The underlying geometry of the manifold induces certain symmetries on the multivariate nonparametric regression function. The Bayesian approach then allows one to incorporate hierarchical Bayesian methods directly into the spectral structure, thus providing a symmetry-adaptive multivariate Bayesian function estimator. One can also diffuse away some of the prior information, in which case the limit is a smoothing spline on the manifold. This, together with the result that the smoothing spline solution attains the minimax rate of convergence in the multivariate nonparametric regression problem, provides good frequentist properties for the Bayes estimators. An application to astronomy is included.
Recently a blind source separation model was suggested for spatial data together with an estimator based on the simultaneous diagonalisation of two scatter matrices. The asymptotic properties of this estimator are derived here and a new estimator, based on the joint diagonalisation of more than two scatter matrices, is proposed. The asymptotic properties and merits of the novel estimator are verified in simulation studies. A real data example illustrates the method.
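As a rough illustration of the simultaneous diagonalisation of two scatter matrices mentioned above, the following minimal sketch whitens with the first matrix and then rotates with the eigenvectors of the second. It is not the authors' implementation: the function names, the ball-kernel local covariance, the radius and the synthetic data are all assumptions made only for this example.

```python
import numpy as np

def local_covariance(x, coords, radius):
    """Hypothetical local scatter: (1/n) * sum over pairs of sites within
    `radius` of each other (excluding zero distance) of x_i x_j^T."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    weights = ((dists <= radius) & (dists > 0)).astype(float)
    return x.T @ weights @ x / x.shape[0]

def simultaneous_diagonalise(s1, s2):
    """Return (w, d) with w @ s1 @ w.T = I and w @ s2 @ w.T = diag(d).
    Assumes s1 is symmetric positive definite and s2 is symmetric."""
    evals, evecs = np.linalg.eigh(s1)
    s1_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T  # whitening step
    m = s1_inv_sqrt @ s2 @ s1_inv_sqrt
    m = (m + m.T) / 2  # remove rounding asymmetry before eigh
    d, u = np.linalg.eigh(m)
    return u.T @ s1_inv_sqrt, d

# Synthetic data, used only to exercise the algebra (no real spatial signal).
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.standard_normal((200, 3))
x -= x.mean(axis=0)

s1 = np.cov(x, rowvar=False)                  # first scatter matrix
s2 = local_covariance(x, coords, radius=1.0)  # second scatter matrix
w, d = simultaneous_diagonalise(s1, s2)
```

With more than two scatter matrices an exact simultaneous diagonaliser generally does not exist, which is why approximate joint diagonalisation is needed in that case.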
We study the problem of estimating a multivariate convex function defined on a convex body in a regression setting with random design. We are interested in optimal rates of convergence under a squared global continuous $l_2$ loss in the multivariate setting $(d\geq 2)$. One crucial fact is that the minimax risks depend heavily on the shape of the support of the regression function. It is shown that the global minimax risk is on the order of $n^{-2/(d+1)}$ when the support is sufficiently smooth, but that the rate is $n^{-4/(d+4)}$ when the support is a polytope. Such differences in rates are due to difficulties in estimating the regression function near the boundary of smooth regions. We then study the natural bounded least squares estimators (BLSE): we show that the BLSE nearly attains the optimal rates of convergence in low dimensions, while suffering rate-inefficiency in high dimensions. We show, by a local entropy method, that the BLSE adapts nearly parametrically to polyhedral functions when the support is polyhedral in low dimensions. We also show that the boundedness constraint cannot be dropped when risk is assessed via continuous $l_2$ loss. Given rate sub-optimality of the BLSE in higher dimensions, we further study rate-efficient adaptive estimation procedures. Two general model selection methods are developed to provide sieved adaptive estimators (SAE) that achieve nearly optimal rates of convergence for particular regular classes of convex functions, while maintaining nearly parametric rate-adaptivity to polyhedral functions in arbitrary dimensions. Interestingly, the uniform boundedness constraint is unnecessary when risks are measured in discrete $l_2$ norms.
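Taking the two exponents stated above at face value, a short calculation shows how the smooth-support rate $n^{-2/(d+1)}$ and the polytope rate $n^{-4/(d+4)}$ compare:

$$\frac{4}{d+4} - \frac{2}{d+1} = \frac{4(d+1) - 2(d+4)}{(d+1)(d+4)} = \frac{2(d-2)}{(d+1)(d+4)} \geq 0 \quad \text{for } d \geq 2.$$

The polytope exponent is therefore never smaller, with equality only at $d = 2$ (both equal $2/3$). For $d = 3$, for instance, the rates are $n^{-1/2}$ for a smooth support and $n^{-4/7}$ for a polytope.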
This work affords new insights into Bayesian CART in the context of structured wavelet shrinkage. The main thrust is to develop a formal inferential framework for Bayesian tree-based regression. We reframe Bayesian CART as a g-type prior which departs from the typical wavelet product priors by harnessing correlation induced by the tree topology. The practically used Bayesian CART priors are shown to attain adaptive near rate-minimax posterior concentration in the supremum norm in regression models. For the fundamental goal of uncertainty quantification, we construct adaptive confidence bands for the regression function with uniform coverage under self-similarity. In addition, we show that tree-posteriors enable optimal inference in the form of efficient confidence sets for smooth functionals of the regression function.
Shrinkage priors are becoming more and more popular in Bayesian modeling for high-dimensional sparse problems due to their computational efficiency. Recent works show that a polynomially decaying prior leads to satisfactory posterior asymptotics under regression models. In the literature, statisticians have investigated how the global shrinkage parameter, i.e., the scale parameter, in a heavy-tailed prior affects the posterior contraction. In this work, we explore how the shape of the prior, or more specifically, the polynomial order of the prior tail, affects the posterior. We discover that, under the sparse normal means model, the polynomial order does affect the multiplicative constant of the posterior contraction rate. More importantly, if the polynomial order is sufficiently close to 1, it induces the optimal Bayesian posterior convergence, in the sense that the Bayesian contraction rate is sharply minimax, i.e., not only the order but also the multiplicative constant of the posterior contraction rate is optimal. The above Bayesian sharp minimaxity holds when the global shrinkage parameter follows a deterministic choice which depends on the unknown sparsity $s$. Therefore, a Beta-prior modeling is further proposed, such that our sharply minimax Bayesian procedure is adaptive to unknown $s$. Our theoretical discoveries are supported by simulation studies.