The Inverse G-Wishart Distribution and Variational Message Passing

Published by Matt Wand

Publication date: 2020

Research field: Mathematical Statistics, Informatics Engineering

Language: English

Authors: L. Maestrini, M.P. Wand

Machine Learning

Message passing on a factor graph is a powerful paradigm for the coding of approximate inference algorithms for arbitrarily large graphical models. The notion of a factor graph fragment allows for compartmentalization of algebra and computer code. We show that the Inverse G-Wishart family of distributions enables fundamental variational message passing factor graph fragments to be expressed elegantly and succinctly. Such fragments arise in models for which approximate inference concerning covariance matrix or variance parameters is made, and are ubiquitous in contemporary statistics and machine learning.

Bruce Hajek, Yihong Wu, Jiaming Xu (2015)

The principal submatrix localization problem deals with recovering a $K\times K$ principal submatrix of elevated mean $\mu$ in a large $n\times n$ symmetric matrix subject to additive standard Gaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime $\Omega(\sqrt{n}) \leq K \leq o(n)$, the support of the submatrix can be weakly recovered (with $o(K)$ misclassification errors on average) by an optimized message passing algorithm if $\lambda = \mu^2K^2/n$, the signal-to-noise ratio, exceeds $1/e$. This extends a result by Deshpande and Montanari previously obtained for $K=\Theta(\sqrt{n})$. In addition, the algorithm can be extended to provide exact recovery whenever information-theoretically possible and achieve the information limit of exact recovery as long as $K \geq \frac{n}{\log n} (\frac{1}{8e} + o(1))$. The total running time of the algorithm is $O(n^2\log n)$. Another version of the submatrix localization problem, known as noisy biclustering, aims to recover a $K_1\times K_2$ submatrix of elevated mean $\mu$ in a large $n_1\times n_2$ Gaussian matrix. The optimized message passing algorithm and its analysis are adapted to the bicluster problem assuming $\Omega(\sqrt{n_i}) \leq K_i \leq o(n_i)$ and $K_1\asymp K_2$. A sharp information-theoretic condition for the weak recovery of both clusters is also identified.
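
As a rough illustration of the threshold quoted in this abstract, the sketch below evaluates the signal-to-noise ratio $\lambda = \mu^2K^2/n$ and compares it with $1/e$. The function names are hypothetical and chosen for this example only. The abstract's result is asymptotic and holds in the regime $\Omega(\sqrt{n}) \leq K \leq o(n)$, so a finite-size check like this is only indicative, and it does not implement the message passing algorithm itself.

```python
import math


def snr(mu: float, K: int, n: int) -> float:
    """Signal-to-noise ratio lambda = mu^2 * K^2 / n, as defined in the abstract."""
    return mu ** 2 * K ** 2 / n


def exceeds_weak_recovery_threshold(mu: float, K: int, n: int) -> bool:
    """True when lambda exceeds 1/e (hypothetical helper; asymptotic condition only)."""
    return snr(mu, K, n) > 1.0 / math.e


# Illustrative values (assumed for this example, not taken from the paper).
print(exceeds_weak_recovery_threshold(mu=1.0, K=100, n=10_000))
print(exceeds_weak_recovery_threshold(mu=0.5, K=100, n=10_000))
```

With these assumed values, $\lambda$ is 1.0 in the first call and 0.25 in the second, so only the first lies above $1/e \approx 0.368$.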

Machine Learning, Information Theory, Social and Information Networks

Robust Group Synchronization via Cycle-Edge Message Passing

Gilad Lerman, Yunpeng Shi (2019)

We propose a general framework for solving the group synchronization problem, where we focus on the setting of adversarial or uniform corruption and sufficiently small noise. Specifically, we apply a novel message passing procedure that uses cycle consistency information in order to estimate the corruption levels of group ratios and consequently solve the synchronization problem in our setting. We first explain why the group cycle consistency information is essential for effectively solving group synchronization problems. We then establish exact recovery and linear convergence guarantees for the proposed message passing procedure under a deterministic setting with adversarial corruption. These guarantees hold as long as the ratio of corrupted cycles per edge is bounded by a reasonable constant. We also establish the stability of the proposed procedure to sub-Gaussian noise. We further establish exact recovery with high probability under a common uniform corruption model.

Machine Learning, Information Theory

Bayesian Functional Principal Components Analysis via Variational Message Passing

Tui H. Nolan, Jeff Goldsmith, David Ruppert (2021)

Functional principal components analysis is a popular tool for inference on functional data. Standard approaches rely on an eigendecomposition of a smoothed covariance surface in order to extract the orthonormal functions representing the major modes of variation. This approach can be a computationally intensive procedure, especially in the presence of large datasets with irregular observations. In this article, we develop a Bayesian approach, which aims to determine the Karhunen-Loève decomposition directly without the need to smooth and estimate a covariance surface. More specifically, we develop a variational Bayesian algorithm via message passing over a factor graph, which is more commonly referred to as variational message passing. Message passing algorithms are a powerful tool for compartmentalizing the algebra and coding required for inference in hierarchical statistical models. Recently, there has been much focus on formulating variational inference algorithms in the message passing framework because it removes the need for rederiving approximate posterior density functions if there is a change to the model. Instead, model changes are handled by changing specific computational units, known as fragments, within the factor graph. We extend the notion of variational message passing to functional principal components analysis. Indeed, this is the first article to address a functional data model via variational message passing. Our approach introduces two new fragments that are necessary for Bayesian functional principal components analysis. We present the computational details, a set of simulations for assessing accuracy and speed and an application to United States temperature data.

Methodology

Streaming Bayesian inference: theoretical limits and mini-batch approximate message-passing

Andre Manoel, Florent Krzakala, Eric W. Tramel (2017)

In statistical learning for real-world large-scale data problems, one must often resort to streaming algorithms which operate sequentially on small batches of data. In this work, we present an analysis of the information-theoretic limits of mini-batch inference in the context of generalized linear models and low-rank matrix factorization. In a controlled Bayes-optimal setting, we characterize the optimal performance and phase transitions as a function of mini-batch size. We base part of our results on a detailed analysis of a mini-batch version of the approximate message-passing algorithm (Mini-AMP), which we introduce. Additionally, we show that this theoretical optimality carries over into real-data problems by illustrating that Mini-AMP is competitive with standard streaming algorithms for clustering.

Machine Learning, Statistical Mechanics, Information Theory

PCA Initialization for Approximate Message Passing in Rotationally Invariant Models

Marco Mondelli, Ramji Venkataramanan (2021)

We study the problem of estimating a rank-$1$ signal in the presence of rotationally invariant noise, a class of perturbations more general than Gaussian noise. Principal Component Analysis (PCA) provides a natural estimator, and sharp results on its performance have been obtained in the high-dimensional regime. Recently, an Approximate Message Passing (AMP) algorithm has been proposed as an alternative estimator with the potential to improve the accuracy of PCA. However, the existing analysis of AMP requires an initialization that is both correlated with the signal and independent of the noise, which is often unrealistic in practice. In this work, we combine the two methods, and propose to initialize AMP with PCA. Our main result is a rigorous asymptotic characterization of the performance of this estimator. Both the AMP algorithm and its analysis differ from those previously derived in the Gaussian setting: at every iteration, our AMP algorithm requires a specific term to account for PCA initialization, while in the Gaussian case, PCA initialization affects only the first iteration of AMP. The proof is based on a two-phase artificial AMP that first approximates the PCA estimator and then mimics the true AMP. Our numerical simulations show an excellent agreement between AMP results and theoretical predictions, and suggest an interesting open direction on achieving Bayes-optimal performance.

Machine Learning, Information Theory