This paper introduces a Laplace approximation to Bayesian inference in regression models for multivariate response variables. We focus on Dirichlet regression models, which can be used to analyze a set of variables on a simplex exhibiting skewness and heteroscedasticity, without having to transform the data. These data, which mainly consist of proportions or percentages of disjoint categories, are widely known as compositional data and are common in areas such as ecology, geology, and psychology. We provide both the theoretical foundations and a description of how this Laplace approximation can be implemented in the case of Dirichlet regression. The paper also introduces the R package dirinla, which extends the INLA package; INLA cannot directly handle multivariate likelihoods such as the Dirichlet likelihood. Simulation studies are presented to validate the good behaviour of the proposed method, while a real-data case study is used to show how this approach can be applied.
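As a concrete illustration of the likelihood being approximated, the following sketch fits a Dirichlet regression by maximum likelihood with a log link on simulated compositional data. All names and the data are illustrative; this is plain Python/SciPy, not the dirinla or INLA API, and the Laplace step itself (a Gaussian approximation around the posterior mode) is omitted for brevity:

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated compositional data: n observations on a 3-part simplex, with a
# log link from one covariate to each Dirichlet parameter (illustrative
# setup, not the dirinla interface).
n, C = 200, 3
x = rng.normal(size=n)
beta_true = np.array([[0.5, 0.5], [0.2, -0.4], [0.0, 0.3]])  # (C, 2): intercept, slope
alpha_true = np.exp(beta_true[:, 0] + np.outer(x, beta_true[:, 1]))  # (n, C)
# Clip away exact zeros so log(y) below stays finite.
y = np.clip(np.array([rng.dirichlet(a) for a in alpha_true]), 1e-10, None)

def neg_log_lik(beta_flat):
    beta = beta_flat.reshape(C, 2)
    alpha = np.exp(beta[:, 0] + np.outer(x, beta[:, 1]))  # log link, (n, C)
    # Dirichlet log density summed over observations.
    return -np.sum(gammaln(alpha.sum(1)) - gammaln(alpha).sum(1)
                   + ((alpha - 1.0) * np.log(y)).sum(1))

fit = minimize(neg_log_lik, np.zeros(2 * C), method="BFGS")
beta_hat = fit.x.reshape(C, 2)  # should recover beta_true approximately
```

A Laplace approximation would then use the Hessian of `neg_log_lik` at `beta_hat` as the precision of a Gaussian posterior approximation.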
There is an increasing interest in estimating expectations outside of the classical inference framework, such as for models expressed as probabilistic programs. Many of these contexts call for some form of nested inference to be applied. In this paper, we analyse the behaviour of nested Monte Carlo (NMC) schemes, for which classical convergence proofs are insufficient. We give conditions under which NMC will converge, establish a rate of convergence, and provide empirical evidence that this rate is observed in practice. Finally, we prove that general-purpose nested inference schemes are inherently biased. Our results serve to warn of the dangers associated with naive composition of inference and models.
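The bias the abstract warns about is easy to exhibit in a toy nested expectation. The following sketch (our own illustrative example, not taken from the paper) estimates $\gamma = \mathbb{E}_y[(\mathbb{E}[z \mid y])^2]$ with $y \sim N(0,1)$ and $z \mid y \sim N(y,1)$, whose true value is $\mathbb{E}[y^2] = 1$; the plug-in NMC estimator has bias $1/M$ in the inner sample size $M$, which no amount of outer sampling removes:

```python
import numpy as np

rng = np.random.default_rng(1)

def nmc_estimate(n_outer, n_inner):
    # Toy nested expectation: gamma = E_y[(E[z | y])^2] with y ~ N(0, 1)
    # and z | y ~ N(y, 1), so the true value is E[y^2] = 1.
    y = rng.normal(size=n_outer)
    z = y[:, None] + rng.normal(size=(n_outer, n_inner))
    inner = z.mean(axis=1)       # inner Monte Carlo estimate of E[z | y]
    return np.mean(inner ** 2)   # outer average of a nonlinear functional

# The plug-in estimator is biased: E[zbar^2] = 1 + 1/M, so the bias decays
# only as the inner sample size M grows, not with the outer size N.
est_small = nmc_estimate(100_000, 5)    # approx 1 + 1/5
est_large = nmc_estimate(10_000, 500)   # approx 1 + 1/500
```

Increasing both $N$ and $M$ together is what yields the convergence rates the paper establishes; fixing $M$ leaves a persistent bias.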
Prediction rule ensembles (PREs) are sparse collections of rules, offering highly interpretable regression and classification models. This paper presents the R package pre, which derives PREs through the methodology of Friedman and Popescu (2008). The implementation and functionality of the pre package are described and illustrated through an application to a dataset on the prediction of depression. Furthermore, the accuracy and sparsity of PREs are compared with those of single trees, random forests, and lasso regression on four benchmark datasets. Results indicate that pre derives ensembles with predictive accuracy comparable to that of random forests, while using a smaller number of variables for prediction.
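The two-stage idea behind rule ensembles can be sketched as follows. This is a deliberately crude Python/scikit-learn stand-in for the Friedman and Popescu (2008) methodology, not the pre package itself: grow shallow trees on bootstrap samples, treat every leaf as a conjunctive 0/1 rule, then select a sparse subset of rules with the lasso:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=3)

# Stage 1: generate candidate rules from shallow trees on bootstrap samples.
rules = []
for b in range(30):
    idx = rng.integers(0, len(X), len(X))  # bootstrap sample
    tree = DecisionTreeRegressor(max_depth=3, random_state=b).fit(X[idx], y[idx])
    leaves = tree.apply(X)                 # leaf id for each observation
    for leaf in np.unique(leaves):
        # Each leaf corresponds to a conjunction of split conditions,
        # encoded here as a 0/1 membership column.
        rules.append((leaves == leaf).astype(float))
R = np.column_stack(rules)

# Stage 2: lasso selects a sparse, interpretable subset of the rules.
lasso = LassoCV(cv=5).fit(R, y)
n_active = int(np.sum(lasso.coef_ != 0))   # rules surviving selection
```

The surviving rules, each a readable if-then condition, form the sparse ensemble that makes PREs interpretable.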
We develop a fully Bayesian nonparametric regression model based on a Lévy process prior, named MLABS (Multivariate Lévy Adaptive B-Spline regression), a multivariate version of the LARK (Lévy Adaptive Regression Kernels) models, for estimating unknown functions with either varying degrees of smoothness or high interaction orders. Lévy process priors have the advantages of encouraging sparsity in the expansions and providing automatic selection of the number of basis functions. The unknown regression function is expressed as a weighted sum of tensor products of B-spline basis functions, the elements of an overcomplete system, which can handle multi-dimensional data. The B-spline basis can systematically express functions with varying degrees of smoothness. By changing the set of degrees of the tensor product basis functions, MLABS can adapt to the smoothness of target functions thanks to the nice properties of B-spline bases. The local support of the B-spline basis enables MLABS to make more delicate predictions than other existing methods on two-dimensional surface data. Experiments on various simulated and real-world datasets illustrate that the MLABS model has comparable performance on regression and classification problems. We also show that the MLABS model has more stable and accurate predictive abilities than state-of-the-art nonparametric regression models on relatively low-dimensional data.
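The tensor-product B-spline expansion at the core of the model can be sketched with a plain least-squares fit. This is an illustrative Python/SciPy example of the basis construction only (all names are ours), not the MLABS model, which places a Lévy process prior over the number, locations, and weights of such basis functions:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, n_basis=8, degree=3):
    # Clamped (open-uniform) cubic B-spline basis on [0, 1], evaluated at x.
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)
    t = np.r_[[0.0] * degree, interior, [1.0] * degree]
    # Identity coefficients make the spline return all basis functions at once.
    return BSpline(t, np.eye(n_basis), degree)(x)   # shape (len(x), n_basis)

rng = np.random.default_rng(2)
n = 2000
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
f = np.sin(2 * np.pi * x1) * np.cos(np.pi * x2)     # smooth 2-D target surface
y = f + 0.1 * rng.normal(size=n)

# Tensor product of the two 1-D bases: one row of 8 x 8 = 64 features per point.
B1, B2 = bspline_basis(x1), bspline_basis(x2)
X = np.einsum('ni,nj->nij', B1, B2).reshape(n, -1)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
rmse = np.sqrt(np.mean((X @ coef - f) ** 2))        # error against the true surface
```

Because each basis function has local support, coefficients influence the fit only near their knots, which is what allows the locally delicate predictions the abstract describes.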
Gaussian latent variable models are a key class of Bayesian hierarchical models with applications in many fields. Performing Bayesian inference on such models can be challenging as Markov chain Monte Carlo algorithms struggle with the geometry of the resulting posterior distribution and can be prohibitively slow. An alternative is to use a Laplace approximation to marginalize out the latent Gaussian variables and then integrate out the remaining hyperparameters using dynamic Hamiltonian Monte Carlo, a gradient-based Markov chain Monte Carlo sampler. To implement this scheme efficiently, we derive a novel adjoint method that propagates the minimal information needed to construct the gradient of the approximate marginal likelihood. This strategy yields a scalable differentiation method that is orders of magnitude faster than state-of-the-art differentiation techniques when the hyperparameters are high dimensional. We prototype the method in the probabilistic programming framework Stan and test the utility of the embedded Laplace approximation on several models, including one where the dimension of the hyperparameter is $\sim$6,000. Depending on the case, the benefits can include an alleviation of the geometric pathologies that frustrate Hamiltonian Monte Carlo and a dramatic speed-up.
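The marginalization referred to here is the standard Laplace approximation to the marginal likelihood; in generic notation (not Stan-specific), with latent Gaussian variables $\theta \in \mathbb{R}^d$ and hyperparameters $\phi$,

$$
\log p(y \mid \phi) \;\approx\; \log p\big(y, \hat{\theta}(\phi) \mid \phi\big) \;-\; \tfrac{1}{2} \log \det H(\phi) \;+\; \tfrac{d}{2} \log 2\pi,
$$

where $\hat{\theta}(\phi) = \arg\max_{\theta} \log p(y, \theta \mid \phi)$ and $H(\phi) = -\nabla^2_{\theta} \log p(y, \theta \mid \phi)\big|_{\theta = \hat{\theta}(\phi)}$. Hamiltonian Monte Carlo then targets $p(\phi \mid y) \propto p(y \mid \phi)\, p(\phi)$, which requires the gradient of this approximation with respect to $\phi$ through both the mode $\hat{\theta}(\phi)$ and the log determinant; propagating that gradient cheaply is what the adjoint method provides.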
We consider Particle Gibbs (PG) as a tool for Bayesian analysis of non-linear non-Gaussian state-space models. PG is a Monte Carlo (MC) approximation of the standard Gibbs procedure which uses sequential MC (SMC) importance sampling inside the Gibbs procedure to update the latent and potentially high-dimensional state trajectories. We propose to combine PG with a generic and easily implementable SMC approach known as Particle Efficient Importance Sampling (PEIS). By using SMC importance sampling densities which are approximately fully globally adapted to the targeted density of the states, PEIS can substantially improve the mixing and the efficiency of the PG draws from the posterior of the states and the parameters relative to existing PG implementations. The efficiency gains achieved by PEIS are illustrated in PG applications to a univariate stochastic volatility model for asset returns, a non-Gaussian nonlinear local-level model for interest rates, and a multivariate stochastic volatility model for the realized covariance matrix of asset returns.
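The SMC ingredient inside Particle Gibbs can be sketched with the simplest possible proposal. The following illustrative Python example (our own toy setup, not the PEIS algorithm) runs a bootstrap particle filter on a simulated local-level model; PEIS instead builds importance densities that are approximately globally adapted to the targeted density of the states, which is the source of the mixing gains:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated local-level model: x_t = x_{t-1} + w_t, y_t = x_t + v_t.
T, sigma_w, sigma_v = 100, 0.5, 1.0
x_true = np.cumsum(sigma_w * rng.normal(size=T))
y = x_true + sigma_v * rng.normal(size=T)

def bootstrap_filter(y, n_particles=500):
    # Bootstrap SMC: propagate from the transition density, weight by the
    # observation density, then resample. This is the naive proposal that a
    # globally adapted scheme such as PEIS improves upon.
    particles = np.zeros(n_particles)
    loglik = 0.0
    for t in range(len(y)):
        particles = particles + sigma_w * rng.normal(size=n_particles)  # propagate
        logw = -0.5 * ((y[t] - particles) / sigma_v) ** 2               # log weights
        w = np.exp(logw - logw.max())
        # Accumulate the log-likelihood estimate (log-sum-exp for stability).
        loglik += (np.log(w.mean()) + logw.max()
                   - 0.5 * np.log(2.0 * np.pi * sigma_v ** 2))
        ancestors = rng.choice(n_particles, n_particles, p=w / w.sum())  # resample
        particles = particles[ancestors]
    return loglik

ll = bootstrap_filter(y)  # unbiased estimate of the marginal log-likelihood (up to Jensen bias)
```

In Particle Gibbs, a conditional version of this filter (keeping one reference trajectory fixed) supplies the state-trajectory update inside the Gibbs sweep.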