ترغب بنشر مسار تعليمي؟ اضغط هنا

Fast Exact Bayesian Inference for Sparse Signals in the Normal Sequence Model

116   0   0.0 ( 0 )
 نشر من قبل Tim van Erven
 تاريخ النشر 2018
  مجال البحث الاحصاء الرياضي
والبحث باللغة English




اسأل ChatGPT حول البحث

We consider exact algorithms for Bayesian inference with model selection priors (including spike-and-slab priors) in the sparse normal sequence model. Because the best existing exact algorithm becomes numerically unstable for sample sizes over n=500, there has been much attention for alternative approaches like approximate algorithms (Gibbs sampling, variational Bayes, etc.), shrinkage priors (e.g. the Horseshoe prior and the Spike-and-Slab LASSO) or empirical Bayesian methods. However, by introducing algorithmic ideas from online sequential prediction, we show that exact calculations are feasible for much larger sample sizes: for general model selection priors we reach n=25000, and for certain spike-and-slab priors we can easily reach n=100000. We further prove a de Finetti-like result for finite sample sizes that characterizes exactly which model selection priors can be expressed as spike-and-slab priors. The computational speed and numerical accuracy of the proposed methods are demonstrated in experiments on simulated data, on a differential gene expression data set, and to compare the effect of multiple hyper-parameter settings in the beta-binomial prior. In our experimental evaluation we compute guaranteed bounds on the numerical accuracy of all new algorithms, which shows that the proposed methods are numerically reliable whereas an alternative based on long division is not.



قيم البحث

اقرأ أيضاً

Aggregating multiple effects is often encountered in large-scale data analysis where the fraction of significant effects is generally small. Many existing methods cannot handle it effectively because of lack of computational accuracy for small p-valu es. The Cauchy combination test (abbreviated as CCT) ( J Am Statist Assoc, 2020, 115(529):393-402) is a powerful and computational effective test to aggregate individual $p$-values under arbitrary correlation structures. This work revisits CCT and shows three key contributions including that (i) the tail probability of CCT can be well approximated by a standard Cauchy distribution under much more relaxed conditions placed on individual p-values instead of the original test statistics; (ii) the relaxation conditions are shown to be satisfied for many popular copulas formulating bivariate distributions; (iii) the power of CCT is no less than that of the minimum-type test as the number of tests goes to infinity with some regular conditions. These results further broaden the theories and applications of CCT. The simulation results verify the theoretic results and the performance of CCT is further evaluated with data from a prostate cancer study.
Bayesian nonparametric priors based on completely random measures (CRMs) offer a flexible modeling approach when the number of latent components in a dataset is unknown. However, managing the infinite dimensionality of CRMs typically requires practit ioners to derive ad-hoc algorithms, preventing the use of general-purpose inference methods and often leading to long compute times. We propose a general but explicit recipe to construct a simple finite-dimensional approximation that can replace the infinite-dimensional CRMs. Our independent finite approximation (IFA) is a generalization of important cases that are used in practice. The independence of atom weights in our approximation (i) makes the construction well-suited for parallel and distributed computation and (ii) facilitates more convenient inference schemes. We quantify the approximation error between IFAs and the target nonparametric prior. We compare IFAs with an alternative approximation scheme -- truncated finite approximations (TFAs), where the atom weights are constructed sequentially. We prove that, for worst-case choices of observation likelihoods, TFAs are a more efficient approximation than IFAs. However, in real-data experiments with image denoising and topic modeling, we find that IFAs perform very similarly to TFAs in terms of task-specific accuracy metrics.
This paper investigates the high-dimensional linear regression with highly correlated covariates. In this setup, the traditional sparsity assumption on the regression coefficients often fails to hold, and consequently many model selection procedures do not work. To address this challenge, we model the variations of covariates by a factor structure. Specifically, strong correlations among covariates are explained by common factors and the remaining variations are interpreted as idiosyncratic components of each covariate. This leads to a factor-adjusted regression model with both common factors and idiosyncratic components as covariates. We generalize the traditional sparsity assumption accordingly and assume that all common factors but only a small number of idiosyncratic components contribute to the response. A Bayesian procedure with a spike-and-slab prior is then proposed for parameter estimation and model selection. Simulation studies show that our Bayesian method outperforms its lasso analogue, manifests insensitivity to the overestimates of the number of common factors, pays a negligible price in the no correlation case, and scales up well with increasing sample size, dimensionality and sparsity. Numerical results on a real dataset of U.S. bond risk premia and macroeconomic indicators lend strong support to our methodology.
217 - Umberto Picchini 2012
Models defined by stochastic differential equations (SDEs) allow for the representation of random variability in dynamical systems. The relevance of this class of models is growing in many applied research areas and is already a standard tool to mode l e.g. financial, neuronal and population growth dynamics. However inference for multidimensional SDE models is still very challenging, both computationally and theoretically. Approximate Bayesian computation (ABC) allow to perform Bayesian inference for models which are sufficiently complex that the likelihood function is either analytically unavailable or computationally prohibitive to evaluate. A computationally efficient ABC-MCMC algorithm is proposed, halving the running time in our simulations. Focus is on the case where the SDE describes latent dynamics in state-space models; however the methodology is not limited to the state-space framework. Simulation studies for a pharmacokinetics/pharmacodynamics model and for stochastic chemical reactions are considered and a MATLAB package implementing our ABC-MCMC algorithm is provided.
We propose a framework for Bayesian non-parametric estimation of the rate at which new infections occur assuming that the epidemic is partially observed. The developed methodology relies on modelling the rate at which new infections occur as a functi on which only depends on time. Two different types of prior distributions are proposed namely using step-functions and B-splines. The methodology is illustrated using both simulated and real datasets and we show that certain aspects of the epidemic such as seasonality and super-spreading events are picked up without having to explicitly incorporate them into a parametric model.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا