In many fields, researchers are interested in discovering, from among a large number of features, those with a substantial effect on the response, while controlling the proportion of false discoveries. By incorporating the knockoff procedure into the Bayesian framework, we develop the Bayesian knockoff filter (BKF) for selecting features that have an important effect on the response. In contrast to the fixed knockoff variables of the frequentist procedures, we allow the knockoff variables to be continuously updated within the Markov chain Monte Carlo. Based on the posterior samples and an elaborated greedy selection procedure, our method can distinguish the truly important features while controlling the Bayesian false discovery rate at a desirable level. Numerical experiments on both synthetic and real data demonstrate the advantages of our method over existing knockoff methods and Bayesian variable selection approaches, namely that the BKF possesses higher power and yields a lower false discovery rate.
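As a concrete illustration of the final selection step, the following minimal Python sketch applies a standard greedy Bayesian FDR rule to posterior inclusion probabilities; the function name `bayesian_fdr_select` and the use of inclusion probabilities are our own assumptions, and the paper's actual feature statistics and greedy procedure may differ.

```python
import numpy as np

def bayesian_fdr_select(pip, q=0.1):
    """Greedy selection controlling the Bayesian FDR at level q (sketch).

    pip : posterior inclusion probabilities, one per feature
          (illustrative; the paper's own statistic may differ).
    Features are added in decreasing order of pip while the running
    average of (1 - pip), the estimated Bayesian FDR, stays <= q.
    """
    order = np.argsort(-pip)                       # most promising first
    fdr = np.cumsum(1.0 - pip[order]) / np.arange(1, pip.size + 1)
    passing = np.where(fdr <= q)[0]                # fdr is nondecreasing
    k = passing[-1] + 1 if passing.size else 0
    return np.sort(order[:k])                      # indices of selected features

# toy usage with made-up posterior inclusion probabilities
rng = np.random.default_rng(0)
pip = rng.beta(0.5, 0.5, size=20)
print(bayesian_fdr_select(pip, q=0.1))
```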
We study the convergence properties of a collapsed Gibbs sampler for Bayesian vector autoregressions with predictors, or exogenous variables. The Markov chain generated by our algorithm is shown to be geometrically ergodic regardless of whether the number of observations in the underlying vector autoregression is small or large in comparison to its order and dimension. In a convergence complexity analysis, we also give conditions under which the geometric ergodicity is asymptotically stable as the number of observations tends to infinity. Specifically, the geometric convergence rate is shown to be bounded away from unity asymptotically, either almost surely or with probability tending to one, depending on what is assumed about the data generating process. This result is one of the first of its kind for practically relevant Markov chain Monte Carlo algorithms. Our convergence results hold under essentially arbitrary model misspecification.
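For reference, geometric ergodicity and the asymptotic stability of the rate can be stated as follows; these are the standard definitions, written in our own notation rather than the paper's.

```latex
% Geometric ergodicity of the Markov transition kernel P with
% invariant distribution \Pi: there exist \rho \in [0,1) and a
% function M such that, for \Pi-almost every starting point x,
\[
  \|P^{t}(x,\cdot) - \Pi(\cdot)\|_{\mathrm{TV}} \le M(x)\,\rho^{t},
  \qquad t = 1, 2, \dots
\]
% Asymptotically stable geometric ergodicity: writing \rho_n for the
% rate when the vector autoregression has n observations,
\[
  \limsup_{n\to\infty} \rho_n < 1
  \quad \text{almost surely, or with probability tending to one.}
\]
```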
In this article, we derive a novel non-reversible, continuous-time Markov chain Monte Carlo (MCMC) sampler, called the Coordinate Sampler, based on a piecewise deterministic Markov process (PDMP); it can be seen as a variant of the Zigzag sampler. In addition to establishing the theoretical validity of this new sampling algorithm, we show that the Markov chain it induces is geometrically ergodic for distributions whose tails decay at least as fast as an exponential distribution and at most as fast as a Gaussian distribution. Several numerical examples highlight that our Coordinate Sampler is more efficient than the Zigzag sampler in terms of effective sample size.
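To make the construction concrete, here is a minimal Python sketch of a coordinate-sampler-style PDMP for a standard Gaussian target, with event times simulated by thinning over short windows. The event rate and velocity-switching kernel follow our reading of the general PDMP literature, and the names (`lam_ref`, `delta`, etc.) are ours; the paper's exact specification may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_U(x):            # standard Gaussian target: U(x) = ||x||^2 / 2
    return x

def coordinate_sampler(d=2, T=200.0, lam_ref=1.0, delta=0.1):
    """Sketch of a coordinate-sampler-style PDMP (not the paper's code).

    The state moves along one coordinate at unit speed; events occur with
    rate max(0, v . grad_U(x)) + lam_ref, simulated by thinning.  At an
    event, the new velocity w is drawn from {+-e_1, ..., +-e_d} with
    probability proportional to max(0, -w . grad_U(x)) + lam_ref.
    """
    x = np.zeros(d)
    i, s = 0, 1.0                         # active coordinate and its sign
    t, samples = 0.0, []
    while t < T:
        # the rate max(0, s*x[i] + u) + lam_ref is nondecreasing in u,
        # so its value at u = delta bounds it on the whole window
        bound = max(0.0, s * x[i] + delta) + lam_ref
        tau = rng.exponential(1.0 / bound)
        step = min(tau, delta)
        x[i] += s * step
        t += step
        samples.append(x.copy())
        if tau <= delta:                  # proposed event inside the window
            rate = max(0.0, s * x[i]) + lam_ref
            if rng.random() < rate / bound:   # accept: switch velocity
                g = grad_U(x)
                w = np.concatenate([np.maximum(0.0, -g),
                                    np.maximum(0.0, g)]) + lam_ref
                k = rng.choice(2 * d, p=w / w.sum())
                i, s = k % d, (1.0 if k < d else -1.0)
    return np.array(samples)

print(coordinate_sampler()[-5:])          # last few recorded positions
```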
The logistic regression model is the most popular model for analyzing binary data. In the absence of any prior information, an improper flat prior is often used for the regression coefficients in Bayesian logistic regression models. The resulting intractable posterior density can be explored by running Polson et al.'s (2013) data augmentation (DA) algorithm. In this paper, we establish that the Markov chain underlying Polson et al.'s (2013) DA algorithm is geometrically ergodic. Proving this theoretical result is practically important, as it ensures the existence of central limit theorems (CLTs) for sample averages under a finite second moment condition. The CLT in turn allows users of the DA algorithm to calculate standard errors for posterior estimates.
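The DA algorithm alternates between drawing Pólya-Gamma latent variables and drawing the coefficients from a Gaussian conditional. Below is a minimal Python sketch of this two-block Gibbs sampler under a flat prior, assuming the third-party `polyagamma` package for PG(1, z) draws; the function name `pg_logistic_gibbs` is ours.

```python
import numpy as np
from polyagamma import random_polyagamma  # assumed third-party PG sampler

def pg_logistic_gibbs(X, y, n_iter=2000, rng=None):
    """Polson-Scott-Windle data augmentation for Bayesian logistic
    regression with a flat prior on beta (sketch).

    Alternates
        omega_i | beta ~ PG(1, x_i' beta)
        beta | omega   ~ N(V X' kappa, V),  V = (X' Omega X)^{-1},
    where kappa = y - 1/2 and Omega = diag(omega).
    """
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    kappa = y - 0.5
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        omega = random_polyagamma(z=X @ beta, random_state=rng)
        V = np.linalg.inv(X.T @ (omega[:, None] * X))
        m = V @ (X.T @ kappa)
        beta = rng.multivariate_normal(m, V)
        draws[t] = beta
    return draws
```

Because the chain is geometrically ergodic, averages of these draws satisfy a CLT under a finite second moment condition, so standard errors for posterior estimates can be obtained by, for example, batch means.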
As well as primary fluctuations, CMB temperature maps contain a wealth of additional information in the form of secondary anisotropies. However, secondary effects that can be identified with individual objects, such as the thermal and kinetic Sunyaev-Zeldovich (TSZ and KSZ) effects due to galaxy clusters, are difficult to disentangle unambiguously from foreground contamination and the primary CMB. We develop a Bayesian formalism for rigorously characterising anisotropies that are localised on the sky, taking the TSZ and KSZ effects as an example. Using a Gibbs sampling scheme, we are able to efficiently sample from the joint posterior distribution of a multi-component model of the sky with many thousands of correlated physical parameters. The posterior can then be exactly marginalised to estimate properties of the secondary anisotropies, fully taking into account degeneracies with the other signals in the CMB map. We show that this method is computationally tractable using a simple implementation based on the existing Commander component separation code, and we also discuss how other types of secondary anisotropy can be accommodated within our framework.
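As a schematic of one such Gibbs step, the sketch below draws component amplitudes from their Gaussian conditional in a generic linear sky model d = A a + n with Gaussian noise and prior. This is our own toy illustration, not the Commander implementation, and all names are ours.

```python
import numpy as np

def gibbs_amplitude_step(d, A, N_inv, S_inv, rng):
    """One Gibbs step for component amplitudes in a linear sky model
    d = A a + n (toy sketch; not the Commander implementation).

    With Gaussian noise (precision N_inv) and a Gaussian prior
    (precision S_inv), the conditional a | d is Gaussian with
    precision Q = A' N_inv A + S_inv and mean Q^{-1} A' N_inv d.
    """
    Q = A.T @ N_inv @ A + S_inv
    L = np.linalg.cholesky(Q)
    mean = np.linalg.solve(Q, A.T @ N_inv @ d)
    # draw a ~ N(mean, Q^{-1}) using the Cholesky factor of the precision:
    # L'^{-1} z has covariance (L L')^{-1} = Q^{-1} for z ~ N(0, I)
    return mean + np.linalg.solve(L.T, rng.standard_normal(len(mean)))
```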
For large-scale online inference problems, the update strategy is critical for performance. We derive an adaptive scan Gibbs sampler that optimizes the update frequency by selecting an optimal mini-batch size. We demonstrate the performance of our adaptive batch-size Gibbs sampler by comparing it against the collapsed Gibbs sampler for the Bayesian Lasso, Dirichlet Process Mixture Model (DPMM), and Latent Dirichlet Allocation (LDA) graphical models.
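One generic way to pick a mini-batch size is to measure effective samples per second for each candidate and keep the best; the sketch below illustrates this heuristic in Python. It is our own stand-in criterion, not the paper's adaptive rule, and `gibbs_step` is a hypothetical user-supplied update.

```python
import time
import numpy as np

def choose_batch_size(gibbs_step, candidates, n_trial=50, rng=None):
    """Pick a mini-batch size by effective samples per second (heuristic).

    gibbs_step(batch_size, rng) should run one mini-batch update and
    return a scalar trace value used to estimate the ESS.
    """
    rng = rng or np.random.default_rng(0)
    best, best_rate = None, -np.inf
    for b in candidates:
        start = time.perf_counter()
        trace = np.array([gibbs_step(b, rng) for _ in range(n_trial)])
        elapsed = time.perf_counter() - start
        # crude ESS from the lag-1 autocorrelation of the trace
        x = trace - trace.mean()
        rho1 = (x[:-1] @ x[1:]) / (x @ x + 1e-12)
        ess = n_trial * (1 - rho1) / (1 + rho1)
        if ess / elapsed > best_rate:
            best, best_rate = b, ess / elapsed
    return best
```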