In this article, we discuss composite likelihood estimation of sparse Gaussian graphical models. When there are symmetry constraints on the concentration matrix or partial correlation matrix, likelihood estimation can be computationally intensive. The composite likelihood offers an alternative formulation of the objective function and yields consistent estimators. When a sparse model is considered, penalized composite likelihood estimation can yield estimates that satisfy both the symmetry and sparsity constraints and possess the oracle property. Application of the proposed method is demonstrated through simulation studies and a network analysis of a biological data set.
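For orientation, one common composite likelihood for a Gaussian graphical model is the conditional one, built from the full conditionals of each variable given the rest. The abstract does not state which composite likelihood is used, so the display below is only a sketch under that assumption, for mean-zero data $x_{ij}$ ($i = 1,\dots,n$, $j = 1,\dots,p$) with concentration matrix $\Theta = (\theta_{jk})$; the symbols are introduced here for illustration.

```latex
X_j \mid X_{-j} \sim N\!\Big(-\sum_{k \neq j} \frac{\theta_{jk}}{\theta_{jj}} X_k,\ \frac{1}{\theta_{jj}}\Big),
\qquad
\ell_c(\Theta) = \sum_{j=1}^{p} \Big[ \frac{n}{2}\log\theta_{jj}
  - \frac{\theta_{jj}}{2} \sum_{i=1}^{n} \Big( x_{ij} + \sum_{k \neq j} \frac{\theta_{jk}}{\theta_{jj}} x_{ik} \Big)^{2} \Big] + \text{const}.
```

Each summand involves only one row of $\Theta$ and no determinant of the full matrix, which is one reason such an objective can be cheaper to handle than the full likelihood when equality constraints tie entries together.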
We introduce the package GraphicalModelsMLE for computing the maximum likelihood estimator (MLE) of a Gaussian graphical model in the computer algebra system Macaulay2. The package computes the MLE for the class of loopless mixed graphs. Additional functionality makes it possible to explore the underlying algebraic structure of the model, such as its ML degree and the ideal of score equations.
We propose a Bayesian approximate inference method for learning the dependence structure of a Gaussian graphical model. Using pseudo-likelihood, we derive an analytical expression to approximate the marginal likelihood for an arbitrary graph structure without invoking any assumptions about decomposability. The majority of the existing methods for learning Gaussian graphical models are either restricted to decomposable graphs or require specification of a tuning parameter that may have a substantial impact on learned structures. By combining a simple sparsity-inducing prior for the graph structures with a default reference prior for the model parameters, we obtain a fast and easily applicable scoring function that works well even for high-dimensional data. We demonstrate the favourable performance of our approach by large-scale comparisons against the leading methods for learning non-decomposable Gaussian graphical models. A theoretical justification for our method is provided by showing that it yields a consistent estimator of the graph structure.
We consider an equivariant approach that imposes data-driven bounds on the variances to avoid singular and spurious solutions in maximum likelihood (ML) estimation of clusterwise linear regression models. We investigate its use in choosing the number of components, and we propose a computational shortcut that significantly reduces the computation time needed to tune the bounds on the data. In the simulation study and the two real-data applications, we show that the proposed methods give a more reliable assessment of the number of components than standard unconstrained methods, together with accurate estimation of the model parameters and accurate cluster recovery.
We discuss an efficient implementation of the iterative proportional scaling procedure for multivariate Gaussian graphical models. We show that the computational cost can be reduced by localizing the update procedure in each iterative step, using the structure of a decomposable model obtained by triangulating the graph associated with the model. Numerical experiments demonstrate the competitive performance of the proposed algorithm.
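As background for the procedure named above, the following is a minimal sketch of plain (non-localized) iterative proportional scaling for a Gaussian graphical model, written with NumPy. It is not the localized algorithm the abstract proposes: it inverts the full concentration matrix at every clique update, which is exactly the cost that localization aims to reduce. The function name `ips_gaussian`, its parameters, and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def ips_gaussian(S, cliques, tol=1e-10, max_iter=1000):
    """Plain IPS sketch: fit a concentration matrix K so that inv(K)
    matches the sample covariance S on every clique, while entries of K
    outside all cliques stay at zero.

    S        -- positive definite sample covariance (p x p)
    cliques  -- list of index lists, one per clique of the graph
    """
    p = S.shape[0]
    K = np.eye(p)  # start from a matrix that already satisfies the zero pattern
    for _ in range(max_iter):
        max_gap = 0.0
        for c in cliques:
            idx = np.ix_(c, c)
            Sigma = np.linalg.inv(K)  # full inverse: the expensive, non-local step
            max_gap = max(max_gap, np.max(np.abs(Sigma[idx] - S[idx])))
            # Adjust the clique block so the fitted marginal on c equals S on c.
            K[idx] += np.linalg.inv(S[idx]) - np.linalg.inv(Sigma[idx])
        if max_gap < tol:  # every clique marginal matched during this sweep
            break
    return K

# Illustrative input (made-up numbers): a 4-cycle 0-1-2-3-0, whose cliques are its edges.
S_demo = np.array([[2.0, 0.5, 0.2, 0.3],
                   [0.5, 2.0, 0.4, 0.1],
                   [0.2, 0.4, 2.0, 0.5],
                   [0.3, 0.1, 0.5, 2.0]])
K_demo = ips_gaussian(S_demo, [[0, 1], [1, 2], [2, 3], [0, 3]])
```

In this sketch the entries of `K_demo` for the missing edges (0, 2) and (1, 3) are never touched, so the sparsity pattern is preserved by construction, and the inverse of `K_demo` agrees with `S_demo` on each edge once the sweep has converged.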
In order to learn the complex features of large spatio-temporal data, models with large parameter sets are often required. However, estimating a large number of parameters is often infeasible due to the computational and memory costs of maximum likelihood estimation (MLE). We introduce the class of marginally parametrized (MP) models, where inference can be performed efficiently with a sequence of marginal (estimated) likelihood functions via stepwise maximum likelihood estimation (SMLE). We provide the conditions under which the stepwise estimators are consistent, and we prove that this class of models includes the diagonal vector autoregressive moving average model. We demonstrate that the parameters of this model can be obtained at least three orders of magnitude faster using SMLE compared to MLE, with only a small loss in statistical efficiency. We apply an MP model to a spatio-temporal global climate data set (in order to learn complex features of interest to climate scientists) consisting of over five million data points, and we demonstrate how estimation can be performed in less than an hour on a laptop.