
On Semi-parametric Bernstein-von Mises Theorems for BART

 Added by Veronika Rockova
 Publication date 2019
Research language: English





Few methods in Bayesian non-parametric statistics and machine learning have received as much attention as Bayesian Additive Regression Trees (BART). While BART is now routinely used for prediction tasks, its theoretical properties began to be understood only very recently. In this work, we continue the theoretical investigation of BART initiated by Rockova and van der Pas (2017). In particular, we study the Bernstein-von Mises (BvM) phenomenon (i.e., asymptotic normality) for smooth linear functionals of the regression surface within the framework of non-parametric regression with fixed covariates. As with other adaptive priors, the BvM phenomenon may fail when the regularities of the functional and the truth are not compatible. To overcome the curse of adaptivity under hierarchical priors, we impose a self-similarity assumption to ensure convergence towards a single Gaussian distribution as opposed to a Gaussian mixture. Similar qualitative restrictions on the functional parameter are known to be necessary for adaptive inference. Many machine learning methods lack coherent probabilistic mechanisms for gauging uncertainty. BART readily provides such quantification via posterior credible sets. The BvM theorem implies that the credible sets are also confidence regions with the same asymptotic coverage. This paper presents the first asymptotic normality result for BART priors, providing another piece of evidence that BART is a valid tool from a frequentist point of view.
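The claim that BvM makes credible sets behave like confidence regions can be checked numerically in a much simpler setting than the one studied in the paper. The sketch below is only an illustration under stated assumptions: it uses a parametric toy model (a normal mean with known unit variance and a conjugate N(0, prior_var) prior), not BART and not a semi-parametric functional. The function name `credible_interval_coverage` and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def credible_interval_coverage(theta0, n, reps, prior_var=1.0, seed=0):
    """Fraction of replications whose 95% credible interval covers theta0.

    Toy model (an assumption for illustration, not the paper's setting):
    y_i ~ N(theta, 1) with a conjugate N(0, prior_var) prior on theta.
    """
    rng = random.Random(seed)
    z = 1.959963984540054  # 97.5% quantile of the standard normal
    hits = 0
    for _ in range(reps):
        # Simulate one data set from the true parameter and average it.
        ybar = sum(rng.gauss(theta0, 1.0) for _ in range(n)) / n
        post_prec = n + 1.0 / prior_var       # posterior precision
        post_mean = n * ybar / post_prec      # posterior mean (prior mean is 0)
        half_width = z / math.sqrt(post_prec)  # half-width of the 95% interval
        if abs(post_mean - theta0) <= half_width:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Frequentist coverage of the Bayesian interval at a fixed true value.
    print(credible_interval_coverage(theta0=1.0, n=200, reps=2000))
```

In this toy model the prior's influence on the posterior mean shrinks at rate 1/n, so for moderately large n the empirical coverage should sit close to the nominal 95% level. The non-parametric setting of the paper is harder precisely because this washing-out of the prior is no longer automatic.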




In this paper, we study the asymptotic posterior distribution of linear functionals of the density. In particular, we give general conditions to obtain a semiparametric version of the Bernstein-von Mises theorem. We then apply this general result to nonparametric priors based on infinite-dimensional exponential families. As a byproduct, we also derive adaptive nonparametric rates of concentration of the posterior distributions under these families of priors on the class of Sobolev and Besov spaces.
Yulong Lu, 2017
We prove a Bernstein-von Mises theorem for a general class of high-dimensional nonlinear Bayesian inverse problems in the vanishing noise limit. We propose a sufficient condition on the growth rate of the number of unknown parameters under which the posterior distribution is asymptotically normal. This growth condition is expressed explicitly in terms of the model dimension, the degree of ill-posedness of the inverse problem and the noise parameter. The theoretical results are applied to Bayesian estimation of the medium parameter in an elliptic problem.
The prominent Bernstein-von Mises (BvM) result claims that the posterior distribution, after centering by the efficient estimator and standardizing by the square root of the total Fisher information, is nearly standard normal. In particular, the prior completely washes out from the asymptotic posterior distribution. This fact is fundamental and justifies the Bayes approach from the frequentist viewpoint. In the nonparametric setup the situation changes dramatically and the impact of the prior becomes essential even for the contraction of the posterior; see [vdV2008], [Bo2011], [CaNi2013,CaNi2014] for different models, such as Gaussian regression or the i.i.d. model, in different weak topologies. This paper offers another, non-asymptotic approach to studying the behavior of the posterior for a special but rather popular and useful class of statistical models and for Gaussian priors. First, we derive tight finite-sample bounds on posterior contraction in terms of the so-called effective dimension of the parameter space. Our main results describe the accuracy of the Gaussian approximation of the posterior. In particular, we show that restricting to the class of all centrally symmetric credible sets around the penalized maximum likelihood estimator (pMLE) yields a Gaussian approximation up to order n^{-1}. We also show that the posterior distribution mimics well the distribution of the pMLE and reduce the question of reliability of credible sets to consistency of the pMLE-based confidence sets. The obtained results are specified for nonparametric log-density estimation and generalized regression.
Deheuvels [J. Multivariate Anal. 11 (1981) 102-113] and Genest and Rémillard [Test 13 (2004) 335-369] have shown that powerful rank tests of multivariate independence can be based on combinations of asymptotically independent Cramér-von Mises statistics derived from a Möbius decomposition of the empirical copula process. A result on the large-sample behavior of this process under contiguous sequences of alternatives is used here to give a representation of the limiting distribution of such test statistics and to compute their relative local asymptotic efficiency. Local power curves and asymptotic relative efficiencies are compared under familiar classes of copula alternatives.
We consider the problem of statistical inference for the effective dynamics of multiscale diffusion processes with (at least) two widely separated characteristic time scales. More precisely, we seek to determine parameters in the effective equation describing the dynamics on the longer diffusive time scale, i.e. in a homogenization framework. We examine the case where both the drift and the diffusion coefficients in the effective dynamics are space-dependent and depend on multiple unknown parameters. It is known that classical estimators, such as Maximum Likelihood and Quadratic Variation of the Path Estimators, fail to obtain reasonable estimates for parameters in the effective dynamics when based on observations of the underlying multiscale diffusion. We propose a novel algorithm for estimating both the drift and diffusion coefficients in the effective dynamics based on a semi-parametric framework. We demonstrate by means of extensive numerical simulations of a number of selected examples that the algorithm performs well when applied to data from a multiscale diffusion. These examples also illustrate that the algorithm can be used effectively to obtain accurate and unbiased estimates.