We study the well-known problem of estimating a sparse $n$-dimensional unknown mean vector $\theta = (\theta_1, \ldots, \theta_n)$ with entries corrupted by Gaussian white noise. In the Bayesian framework, continuous shrinkage priors that can be expressed as scale-mixture normal densities are popular for obtaining sparse estimates of $\theta$. In this article, we introduce a new fully Bayesian scale-mixture prior known as the inverse gamma-gamma (IGG) prior. We prove that the posterior distribution contracts around the true $\theta$ at the (near) minimax rate under very mild conditions. In the process, we prove that the sufficient conditions for minimax posterior contraction given by Van der Pas et al. (2016) are not necessary for optimal posterior contraction. We further show that the IGG posterior density concentrates at a rate faster than those of the horseshoe or the horseshoe+ in the Kullback-Leibler (K-L) sense. To classify true signals ($\theta_i \neq 0$), we also propose a hypothesis test based on thresholding the posterior mean. Taking the loss function to be the expected number of misclassified tests, we show that our test procedure asymptotically attains the optimal Bayes risk exactly. We illustrate through simulations and data analysis that the IGG has excellent finite sample performance for both estimation and classification.
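As a reading aid for the abstract above, the sparse normal means model and a generic scale-mixture shrinkage prior can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact formulation: the noise variance is taken to be one, and the symbols $X_i$, $\epsilon_i$, $\sigma_i^2$, $\kappa_i$ and the generic mixing density $\pi$ are introduced here for illustration. The specific IGG mixing density is not reproduced.
$$
X_i = \theta_i + \epsilon_i, \qquad \epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, 1), \qquad i = 1, \ldots, n,
$$
$$
\theta_i \mid \sigma_i^2 \sim N(0, \sigma_i^2), \qquad \sigma_i^2 \sim \pi(\sigma_i^2) \quad \text{independently for each } i.
$$
Writing $\kappa_i = 1/(1 + \sigma_i^2)$ for the shrinkage factor, the posterior mean under any prior of this form is
$$
E[\theta_i \mid X_i] = \bigl(1 - E[\kappa_i \mid X_i]\bigr) X_i,
$$
so a test that thresholds the posterior mean can equivalently be viewed as thresholding the posterior shrinkage weight $1 - E[\kappa_i \mid X_i]$.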
We show that diffusion processes can be exploited to study the posterior contraction rates of parameters in Bayesian models. By treating the posterior distribution as a stationary distribution of a stochastic differential equation (SDE), posterior convergence rates can be established via control of the moments of the corresponding SDE. Our results depend on the structure of the population log-likelihood function, obtained in the limit of an infinite sample size, and stochastic perturbation bounds between the population and sample log-likelihood functions. When the population log-likelihood is strongly concave, we establish posterior convergence of a $d$-dimensional parameter at the optimal rate $(d/n)^{1/2}$. In the weakly concave setting, we show that the convergence rate is determined by the unique solution of a non-linear equation that arises from the interplay between the degree of weak concavity and the stochastic perturbation bounds. We illustrate this general theory by deriving posterior convergence rates for three concrete examples: Bayesian logistic regression models, Bayesian single index models, and over-specified Bayesian mixture models.
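One standard way to realize a posterior as the stationary distribution of an SDE is the Langevin diffusion; the display below is a sketch under that assumption, and the paper's own SDE may differ in scaling or form. The symbols $\Pi_n$ (posterior density given $n$ observations), $\theta_t$, and $W_t$ (a standard $d$-dimensional Brownian motion) are notation introduced here.
$$
d\theta_t = \tfrac{1}{2} \nabla_\theta \log \Pi_n(\theta_t \mid X_1, \ldots, X_n)\, dt + dW_t .
$$
Under suitable regularity conditions, $\Pi_n$ is invariant for this diffusion, so bounds on the moments of $\theta_t$ as $t \to \infty$ translate into bounds on posterior moments, which is the kind of control the abstract refers to.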
We discuss a general approach to multiple hypothesis testing in the case where each hypothesis states that the vector of parameters identifying the distribution of observations belongs to a convex compact set associated with that hypothesis. With our approach, this problem reduces to testing the hypotheses pairwise. Our central result is a test for a pair of hypotheses of the outlined type which, under appropriate assumptions, is provably nearly optimal. The test is given by a solution to a convex programming problem, so our construction admits a computationally efficient implementation. We further demonstrate that our assumptions are satisfied in several important and interesting applications. Finally, we show how our approach can be applied to a rather general detection problem encompassing several classical statistical settings, such as detection of abrupt signal changes, cusp detection, and multi-sensor detection.
In this paper, we prove almost sure consistency of a survival analysis model that places a Gaussian process, mapped to the unit interval, as a prior on the so-called hazard function. We assume our data consist of survival lifetimes $T$ belonging to $\mathbb{R}^{+}$ and covariates on $[0,1]^d$, where $d$ is an arbitrary dimension. We define an appropriate metric for survival functions and prove posterior consistency with respect to this metric. Our proof is based on an extension of the theorem of Schwartz (1965), which gives general conditions for proving almost sure consistency in the setting of non-i.i.d. random variables. Due to the nature of our data, we prove several results for Gaussian processes on $\mathbb{R}^{+}$ that may be of independent interest.
The logistic regression model is the most popular model for analyzing binary data. In the absence of any prior information, an improper flat prior is often used for the regression coefficients in Bayesian logistic regression models. The resulting intractable posterior density can be explored by running Polson et al.'s (2013) data augmentation (DA) algorithm. In this paper, we establish that the Markov chain underlying Polson et al.'s (2013) DA algorithm is geometrically ergodic. This theoretical result is practically important because it ensures the existence of central limit theorems (CLTs) for sample averages under a finite second moment condition. The CLT in turn allows users of the DA algorithm to calculate standard errors for posterior estimates.
Consider a Poisson point process with unknown support boundary curve $g$, which forms a prototype of an irregular statistical model. We address the problem of estimating non-linear functionals of the form $\int \Phi(g(x))\,dx$. Following a nonparametric maximum-likelihood approach, we construct an estimator which is UMVU over Hölder balls and achieves the (local) minimax rate of convergence. These results hold under weak assumptions on $\Phi$ which are satisfied for $\Phi(u)=|u|^p$, $p\ge 1$. As an application, we consider the problem of estimating the $L^p$-norm and derive the minimax separation rates in the corresponding nonparametric hypothesis testing problem. Structural differences from results for regular nonparametric models are discussed.