Generalized autoregressive moving average (GARMA) models are a class of models developed to extend the univariate Gaussian ARMA time series model to a flexible observation-driven model for non-Gaussian time series data. This work presents a Bayesian approach for GARMA models with Poisson, binomial and negative binomial distributions. A simulation study was carried out to investigate the performance of Bayesian estimation and Bayesian model selection criteria. In addition, three real datasets were analysed using the Bayesian approach on GARMA models.
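To make the observation-driven structure of the GARMA abstract above concrete, the following is a minimal sketch of the conditional log-likelihood of an intercept-only Poisson GARMA(1,1) model with log link. It is an illustration under stated assumptions, not the authors' implementation: the function name, the threshold `c` used to replace zero counts before taking logs, and the initial value of the linear predictor are all choices made here. In a Bayesian treatment, a function like this would be combined with priors on `beta0`, `phi` and `theta` and explored by MCMC.

```python
import math


def poisson_garma11_loglik(y, beta0, phi, theta, c=0.1):
    """Conditional log-likelihood of an intercept-only Poisson GARMA(1,1).

    Linear predictor (log link), for t >= 1:
        eta_t = beta0 + phi * (log y*_{t-1} - beta0)
                      + theta * (log y*_{t-1} - eta_{t-1})
    where y*_{t-1} = max(y_{t-1}, c) avoids log(0).
    Assumptions of this sketch: eta_0 = beta0, and the likelihood
    conditions on the first observation y[0].
    """
    eta_prev = beta0  # assumed starting value for the linear predictor
    loglik = 0.0
    for t in range(1, len(y)):
        g_prev = math.log(max(y[t - 1], c))  # thresholded log of previous count
        eta = beta0 + phi * (g_prev - beta0) + theta * (g_prev - eta_prev)
        mu = math.exp(eta)  # conditional Poisson mean
        loglik += y[t] * eta - mu - math.lgamma(y[t] + 1)  # Poisson log-pmf
        eta_prev = eta
    return loglik
```

With `phi = theta = 0` the predictor reduces to `beta0`, so the function returns an ordinary i.i.d. Poisson log-likelihood, which is a convenient sanity check.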
Transformed Generalized Autoregressive Moving Average (TGARMA) models were recently proposed to deal with non-additivity, non-normality and heteroscedasticity in real time series data. In this paper, a Bayesian approach is proposed for TGARMA models, thus extending the original model. We conducted a simulation study to investigate the performance of Bayesian estimation and Bayesian model selection criteria. In addition, a real dataset was analysed using the proposed approach.
Microorganisms play critical roles in human health and disease. It is well known that microbes live in diverse communities in which they interact synergistically or antagonistically. Thus, for estimating microbial associations with clinical covariates, multivariate statistical models are preferred. Multivariate models allow one to estimate and exploit complex interdependencies among multiple taxa, yielding more powerful tests of exposure or treatment effects than application of taxon-specific univariate analyses. In addition, the analysis of microbial count data requires special attention because the data commonly exhibit zero inflation. To meet these needs, we developed a Bayesian variable selection model for multivariate count data with excess zeros that incorporates information on the covariance structure of the outcomes (counts for multiple taxa), while estimating associations with the mean levels of these outcomes. Although there has been a great deal of effort in zero-inflated models for longitudinal data, little attention has been given to high-dimensional multivariate zero-inflated data modeled via a general correlation structure. Through simulation, we compared the performance of the proposed method to that of existing univariate approaches, for both the binary and count parts of the model. When outcomes were correlated, the proposed variable selection method maintained type I error while boosting the ability to identify true associations in the binary component of the model. For the count part of the model, in some scenarios the univariate method had higher power than the multivariate approach. This higher power came at the cost of a highly inflated false discovery rate not observed with the proposed multivariate method. We applied the approach to oral microbiome data from the Pediatric HIV/AIDS Cohort Oral Health Study and identified five species (of 44) associated with HIV infection.
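As a small illustration of what "excess zeros" means in the microbiome abstract above, the sketch below evaluates a univariate zero-inflated Poisson probability mass function, in which a binary part decides whether a count is a structural zero and a count part generates the remaining values. This is only one common way to represent zero inflation and is an assumption of this example: the abstract does not state the exact form of its binary and count components, and its model is multivariate. The function name and parameter names are hypothetical.

```python
import math


def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson.

    With probability pi the outcome is a structural zero (binary part);
    otherwise it is drawn from Poisson(lam) (count part).
    """
    # Poisson pmf computed on the log scale for numerical stability
    poisson = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    if k == 0:
        return pi + (1.0 - pi) * poisson  # structural zeros plus Poisson zeros
    return (1.0 - pi) * poisson
```

For example, with `pi = 0.3` and `lam = 2.0`, the probability of a zero is `0.3 + 0.7 * exp(-2)`, noticeably larger than the plain Poisson value `exp(-2)`, while the mean shrinks to `(1 - pi) * lam`.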
We propose a flexible model for count time series which has potential uses for both underdispersed and overdispersed data. The model is based on the Conway-Maxwell-Poisson (COM-Poisson) distribution with parameters varying over time to take serial correlation into account. Model estimation is challenging, however, and requires the application of recently proposed methods to deal with the intractable normalising constant and to sample efficiently from the COM-Poisson distribution.
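The normalising constant mentioned in the COM-Poisson abstract above is the infinite sum Z(lambda, nu) = sum over j >= 0 of lambda^j / (j!)^nu, which has no closed form in general. The sketch below approximates it by plain truncation of the series, computed on the log scale. Truncation is a simple stand-in chosen for this example and is not the "recently proposed methods" the abstract refers to; the function names and the default of 200 terms are assumptions, and the truncation point would need to grow for large `lam` or small `nu`.

```python
import math


def com_poisson_log_z(lam, nu, max_terms=200):
    """Log of the truncated COM-Poisson normalising constant.

    Approximates log sum_{j=0}^{max_terms-1} lam**j / (j!)**nu
    using a log-sum-exp to avoid overflow.
    """
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1)
                 for j in range(max_terms)]
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def com_poisson_logpmf(y, lam, nu, max_terms=200):
    """Log P(Y = y) for COM-Poisson(lam, nu) with a truncated constant."""
    return (y * math.log(lam) - nu * math.lgamma(y + 1)
            - com_poisson_log_z(lam, nu, max_terms))
```

Setting `nu = 1` recovers the ordinary Poisson distribution (so `com_poisson_log_z(lam, 1.0)` is approximately `lam`), while `nu > 1` gives underdispersion and `nu < 1` gives overdispersion, which is what lets a single family cover both cases.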
Factor analysis is a flexible technique for assessment of multivariate dependence and codependence. Besides being an exploratory tool used to reduce the dimensionality of multivariate data, it allows estimation of common factors that often have an interesting theoretical interpretation in real problems. However, standard factor analysis is only applicable when the variables are scaled, which is often inappropriate, for example, in data obtained from questionnaires in the field of psychology, where the variables are often categorical. In this framework, we propose a factor model for the analysis of multivariate ordered and non-ordered polychotomous data. Inference is carried out under the Bayesian approach via Markov chain Monte Carlo methods. Two Monte Carlo simulation studies are presented to investigate the performance of this approach in terms of estimation bias, precision and assessment of the number of factors. We also illustrate the proposed method by analyzing participants' responses in the Motivational State Questionnaire dataset, developed to study emotions in laboratory and field settings.
Environmental processes resolved at a sufficiently small scale in space and time will inevitably display non-stationary behavior. Such processes are both challenging to model and computationally expensive when the data size is large. Instead of modeling the global non-stationarity explicitly, local models can be applied to disjoint regions of the domain. The choice of the size of these regions is dictated by a bias-variance trade-off: large regions will have smaller variance and larger bias, whereas small regions will have higher variance and smaller bias. From both the modeling and computational points of view, small regions are preferable because they better accommodate the non-stationarity. However, in practice, large regions are necessary to control the variance. We propose a novel Bayesian three-step approach that allows for smaller regions without incurring the increase in variance that would otherwise follow. We are able to propagate the uncertainty from one step to the next without issues caused by reusing the data. The improvement in inference also results in improved prediction, as our simulated example shows. We illustrate this new approach on a data set of simulated high-resolution wind speed data over Saudi Arabia.