
High-dimensional Gaussian sampling: a review and a unifying approach based on a stochastic proximal point algorithm

Added by Maxime Vono
Publication date: 2020
Language: English





Efficient sampling from a high-dimensional Gaussian distribution is an old but high-stakes issue. Vanilla Cholesky samplers imply a computational cost and memory requirements which can rapidly become prohibitive in high dimension. To tackle these issues, multiple methods have been proposed by different communities, ranging from iterative numerical linear algebra to Markov chain Monte Carlo (MCMC) approaches. Surprisingly, no complete review and comparison of these methods has been conducted. This paper aims at reviewing these approaches by pointing out their differences, close relations, benefits and limitations. In addition to this state of the art, this paper proposes a unifying Gaussian simulation framework by deriving a stochastic counterpart of the celebrated proximal point algorithm in optimization. This framework offers a novel and unifying revisit of most of the existing MCMC approaches while extending them. Guidelines to choose the appropriate Gaussian simulation method for a given sampling problem in high dimension are proposed and illustrated with numerical examples.
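
To make the cost trade-off concrete, here is a minimal sketch (not taken from the paper) contrasting a vanilla Cholesky sampler with a single-site Gibbs sweep for a Gaussian specified by its precision matrix Q; the function names and the NumPy-based setup are illustrative assumptions.

```python
import numpy as np

def cholesky_sampler(mu, Q, rng):
    """Exact draw from N(mu, Q^{-1}) via a Cholesky factorization of the
    precision matrix Q.  The factorization costs O(d^3) flops and O(d^2)
    memory, which is what makes the vanilla sampler prohibitive in high
    dimension."""
    L = np.linalg.cholesky(Q)            # Q = L L^T, O(d^3)
    z = rng.standard_normal(mu.shape)    # z ~ N(0, I)
    x = np.linalg.solve(L.T, z)          # x = L^{-T} z  =>  Cov(x) = Q^{-1}
    return mu + x

def gibbs_sweep(x, mu, Q, rng):
    """One systematic-scan Gibbs sweep targeting N(mu, Q^{-1}).  Each
    coordinate update touches one row of Q (O(d) flops), so a full sweep
    costs O(d^2) and never forms a factorization."""
    for i in range(len(x)):
        # x_i | x_{-i} ~ N(mu_i - r / Q_ii, 1 / Q_ii),
        # with r = sum_{j != i} Q_ij (x_j - mu_j).
        r = Q[i, :] @ (x - mu) - Q[i, i] * (x[i] - mu[i])
        cond_mean = mu[i] - r / Q[i, i]
        x[i] = cond_mean + rng.standard_normal() / np.sqrt(Q[i, i])
    return x
```

The Cholesky route pays a cubic factorization up front for exact i.i.d. samples, whereas each Gibbs sweep is only quadratic but produces correlated draws; this is the kind of trade-off the review formalises.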



Related research

In this paper we introduce a new sampling algorithm which has the potential to be adopted as a universal replacement for the Metropolis-Hastings algorithm. It is related to the slice sampler, and motivated by an algorithm for discrete probability distributions which obviates the need for a proposal distribution, in that it has no accept/reject component. This paper looks at the continuous counterpart. A latent variable combined with a slice sampler and a shrinkage procedure applied to uniform density functions creates a highly efficient sampler which can generate random variables from very high-dimensional distributions as a single block.
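
For readers unfamiliar with the shrinkage procedure mentioned above, the sketch below shows the generic univariate slice sampler with stepping-out and shrinkage (in the style of Neal, 2003); it only illustrates that ingredient, not the block sampler proposed in the paper, and the step width w and function names are assumptions.

```python
import numpy as np

def slice_sample_1d(x0, logf, rng, w=1.0, max_steps=50):
    """One update of a univariate slice sampler: draw an auxiliary level
    under log f, step out to bracket the slice, then shrink the bracket
    after each rejected uniform proposal."""
    log_y = logf(x0) + np.log(rng.uniform())       # auxiliary slice level
    # Step out: place an interval of width w around x0 and expand it.
    left = x0 - w * rng.uniform()
    right = left + w
    for _ in range(max_steps):
        if logf(left) < log_y:
            break
        left -= w
    for _ in range(max_steps):
        if logf(right) < log_y:
            break
        right += w
    # Shrinkage: propose uniformly in (left, right); on rejection, shrink
    # the interval towards x0 (which always lies inside the slice).
    while True:
        x1 = rng.uniform(left, right)
        if logf(x1) > log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1
```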
In this paper, we discuss the problem of minimizing the sum of two convex functions: a smooth function plus a non-smooth function. Further, the smooth part can be expressed as the average of a large number of smooth component functions, and the non-smooth part is equipped with a simple proximal mapping. We propose a proximal stochastic second-order method, which is efficient and scalable. It incorporates the Hessian of the smooth part of the function and exploits a multistage scheme to reduce the variance of the stochastic gradient. We prove that our method achieves a linear rate of convergence.
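
As a simplified illustration of the structure described above (the second-order/Hessian component is omitted), the sketch below implements a first-order proximal SVRG-style loop with an assumed l1 non-smooth part; the function names, the soft-thresholding prox, and the grad_i interface are illustrative assumptions, not the authors' method.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal mapping of t * ||.||_1, an example of a 'simple' prox."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_svrg(x, grad_i, n, lam, step, n_stages, m, rng):
    """Simplified first-order proximal SVRG.  grad_i(x, i) returns the
    gradient of the i-th smooth component at x.  At each stage the full
    gradient at an anchor point is reused to reduce the variance of the
    stochastic gradients, mirroring the multistage scheme described above."""
    for _ in range(n_stages):
        anchor = x.copy()
        full_grad = np.mean([grad_i(anchor, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient estimate.
            g = grad_i(x, i) - grad_i(anchor, i) + full_grad
            x = soft_threshold(x - step * g, step * lam)
    return x
```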
We propose an efficient way to sample from a class of structured multivariate Gaussian distributions which routinely arise as conditional posteriors of model parameters that are assigned a conditionally Gaussian prior. The proposed algorithm only requires matrix operations in the form of matrix multiplications and linear system solutions. We exhibit that the computational complexity of the proposed algorithm grows linearly with the dimension unlike existing algorithms relying on Cholesky factorizations with cubic orders of complexity. The algorithm should be broadly applicable in settings where Gaussian scale mixture priors are used on high dimensional model parameters. We provide an illustration through posterior sampling in a high dimensional regression setting with a horseshoe prior on the vector of regression coefficients.
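
A hedged sketch of this kind of sampler is given below, assuming the standard structured target theta ~ N(Sigma Phi^T alpha, Sigma) with Sigma = (Phi^T Phi + D^{-1})^{-1} and diagonal D, which is the usual form of such conditional posteriors under Gaussian scale mixture priors; the notation and function name are ours, not taken verbatim from the paper.

```python
import numpy as np

def structured_gaussian_sample(Phi, alpha, d_diag, rng):
    """Draw theta ~ N(Sigma Phi^T alpha, Sigma) with
    Sigma = (Phi^T Phi + D^{-1})^{-1} and D = diag(d_diag), using only
    matrix multiplications and one n x n linear solve, so the cost grows
    linearly in the parameter dimension p for fixed sample size n."""
    n, p = Phi.shape
    u = rng.standard_normal(p) * np.sqrt(d_diag)   # u ~ N(0, D)
    delta = rng.standard_normal(n)                 # delta ~ N(0, I_n)
    v = Phi @ u + delta
    # Small n x n system: (Phi D Phi^T + I_n) w = alpha - v
    A = Phi @ (d_diag[:, None] * Phi.T) + np.eye(n)
    w = np.linalg.solve(A, alpha - v)
    return u + d_diag * (Phi.T @ w)                # theta = u + D Phi^T w
```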
The smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP) penalized regression models are two important and widely used nonconvex sparse learning tools that can handle variable selection and parameter estimation simultaneously, and thus have potential applications in various fields such as mining biological data in high-throughput biomedical studies. Theoretically, these two models enjoy the oracle property even in high-dimensional settings, where the number of predictors $p$ may be much larger than the number of observations $n$. Numerically, however, it is quite challenging to develop fast and stable algorithms due to their non-convexity and non-smoothness. In this paper we develop a fast algorithm for SCAD and MCP penalized learning problems. First, we show that the global minimizers of both models are roots of nonsmooth equations. Then, a semi-smooth Newton (SSN) algorithm is employed to solve the equations. We prove that the SSN algorithm converges locally and superlinearly to the Karush-Kuhn-Tucker (KKT) points. Computational complexity analysis shows that the cost of the SSN algorithm per iteration is $O(np)$. Combined with the warm-start technique, the SSN algorithm can be very efficient and accurate. Simulation studies and a real data example suggest that our SSN algorithm, with solution accuracy comparable to that of the coordinate descent (CD) and difference-of-convex (DC) proximal Newton algorithms, is more computationally efficient.
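
The sketch below is not the semi-smooth Newton algorithm itself; it only illustrates, under the standard definitions of the penalties, the univariate SCAD and MCP thresholding rules whose non-smoothness and non-convexity make these problems hard. The defaults a = 3.7 and gamma = 3 are conventional choices, not values taken from the paper.

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD thresholding: the minimizer of
    0.5 * (z - t)^2 + SCAD_{lam, a}(t)."""
    az = np.abs(z)
    if az <= 2.0 * lam:
        return np.sign(z) * max(az - lam, 0.0)          # soft-thresholding zone
    if az <= a * lam:
        return ((a - 1.0) * z - np.sign(z) * a * lam) / (a - 2.0)
    return z                                            # no shrinkage for large z

def mcp_threshold(z, lam, gamma=3.0):
    """Univariate MCP thresholding: the same operation for the minimax
    concave penalty with concavity parameter gamma > 1."""
    az = np.abs(z)
    if az <= gamma * lam:
        return np.sign(z) * max(az - lam, 0.0) / (1.0 - 1.0 / gamma)
    return z
```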
Recently, a new algorithm for sampling posteriors of unnormalised probability densities, called ABC Shadow, was proposed in [8]. This talk introduces a global optimisation procedure based on the ABC Shadow simulation dynamics. First the general method is explained, and then results on simulated and real data are presented. The method is rather general, in the sense that it applies to probability densities that are continuously differentiable with respect to their parameters.
