Fast Bayesian Non-Negative Matrix Factorisation and Tri-Factorisation

Published by: Thomas Brouwer

Publication date: 2016

Research field: Informatics Engineering

Language: English

Authors: Thomas Brouwer, Jes Frellsen, Pietro Lio

Machine Learning, Artificial Intelligence, Numerical Analysis

Abstract

We present a fast variational Bayesian algorithm for performing non-negative matrix factorisation and tri-factorisation. We show that our approach converges faster, both per iteration and in wall-clock time, than Gibbs sampling and non-probabilistic approaches, and does not require additional samples to estimate the posterior. We show that convergence is particularly difficult for matrix tri-factorisation, but that our variational Bayesian approach offers a fast solution, allowing tri-factorisation to be used more effectively.
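
For orientation, the two decompositions named in the abstract can be written as follows. The symbols (R, U, V, F, S, G) and the dimension letters are notation assumed here for illustration; the abstract itself does not fix any.

```latex
% Non-negative matrix factorisation: two non-negative factors
R \approx U V^{\top}, \qquad
U \in \mathbb{R}_{\geq 0}^{I \times K}, \quad
V \in \mathbb{R}_{\geq 0}^{J \times K}

% Non-negative matrix tri-factorisation: three non-negative factors
R \approx F S G^{\top}, \qquad
F \in \mathbb{R}_{\geq 0}^{I \times K}, \quad
S \in \mathbb{R}_{\geq 0}^{K \times L}, \quad
G \in \mathbb{R}_{\geq 0}^{J \times L}
```

The extra middle factor S lets rows and columns have different numbers of latent groups (K and L), at the cost of a harder optimisation problem.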

Shuai Jiang, Kan Li, Richard Yida Xu (2018)

Non-negative Matrix Factorisation (NMF) has been extensively used in machine learning and data analytics applications. Most existing variations of NMF only consider how each row/column vector of the factorised matrices should be shaped, and ignore the relationship among pairwise rows or columns. In many cases, such pairwise relationships enable better factorisation, for example in image clustering and recommender systems. In this paper, we propose an algorithm named Relative Pairwise Relationship constrained Non-negative Matrix Factorisation (RPR-NMF), which places constraints over relative pairwise distances amongst features by imposing penalties in a triplet form. Two distance measures, squared Euclidean distance and Symmetric divergence, are used, and exponential and hinge loss penalties are adopted for the two measures respectively. It is well known that the so-called multiplicative update rules result in much faster convergence than gradient descent for matrix factorisation. However, applying such update rules to RPR-NMF and proving their convergence is not straightforward. Thus, we use reasonable approximations, which are verified in practice, to relax the complexity introduced by the penalties. Experiments on both synthetic and real datasets demonstrate that our algorithms have advantages in achieving close approximations, satisfying a high proportion of the expected constraints, and achieving superior performance compared with other algorithms.
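
As a point of reference for the multiplicative update rules mentioned above, here is a minimal sketch of the standard updates for plain NMF under squared Euclidean loss. It is not RPR-NMF: the triplet penalties are omitted, and the function name, parameters, and test data are illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, V ~ W @ H, via standard multiplicative updates.

    Illustrative sketch only; no pairwise or triplet penalties are included.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive initialisation so the updates never start at zero.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))  # Frobenius reconstruction error
    return W, H, errors

# Synthetic non-negative matrix of exact rank 4 (hypothetical test data).
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 4)) @ data_rng.random((4, 15))
W, H, errors = nmf_multiplicative(V, rank=4)
```

Because the factors are only ever rescaled by non-negative ratios, no projection step or step-size tuning is needed, which is part of why these rules are popular compared with plain gradient descent.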

Machine Learning

Learning Multimorbidity Patterns from Electronic Health Records Using Non-negative Matrix Factorisation

Abdelaali Hassaine, Dexter Canoy, Jose Roberto Ayala Solares (2019)

Multimorbidity, or the presence of several medical conditions in the same individual, has been increasing in the population, both in absolute and relative terms. However, multimorbidity remains poorly understood, and the evidence from existing research to describe its burden, determinants and consequences has been limited. Previous studies attempting to understand multimorbidity patterns are often cross-sectional and do not explicitly account for the evolution of multimorbidity patterns over time; some of them are based on small datasets and/or use arbitrary and narrow age ranges; and those that employed advanced models usually lack appropriate benchmarking and validation. In this study, we (1) introduce a novel approach for using Non-negative Matrix Factorisation (NMF) for temporal phenotyping (i.e., simultaneously mining disease clusters and their trajectories); (2) provide quantitative metrics for the evaluation of disease clusters from such studies; and (3) demonstrate how the temporal characteristics of the disease clusters that result from our model can help mine multimorbidity networks and generate new hypotheses for the emergence of various multimorbidity patterns over time. We trained and evaluated our models on one of the world's largest electronic health records (EHR), with 7 million patients, of which over 2 million were relevant to this study.

Machine Learning

Magnitude Bounded Matrix Factorisation for Recommender Systems

Shuai Jiang, Kan Li, Richard Yi Da Xu (2018)

Low rank matrix factorisation is often used in recommender systems as a way of extracting latent features. When dealing with large and sparse datasets, traditional recommendation algorithms tend to produce large, unrestrained, fluctuating predicted values, especially for users/items with very few corresponding observations. Although the problem has been partly addressed by imposing bounding constraints on the objective and/or restricting all entries to a fixed range, these approaches have two major shortcomings in terms of recommendation quality that we aim to mitigate in this work: first, they can only deal with one pair of fixed bounds for all entries; second, they are very time-consuming when applied to large scale recommender systems. In this paper, we propose a novel algorithm named Magnitude Bounded Matrix Factorisation (MBMF), which allows different bounds for individual users/items and performs very fast on large scale datasets. The key idea of our algorithm is to construct a model by constraining the magnitude of each individual user/item feature vector. We achieve this by converting from the Cartesian to the Spherical coordinate system with radii set to the corresponding magnitudes, which turns the above constrained optimisation problem into an unconstrained one. The Stochastic Gradient Descent (SGD) method is then applied to solve the unconstrained task efficiently. Experiments on synthetic and real datasets demonstrate that in most cases the proposed MBMF is superior to all existing algorithms in terms of accuracy and time complexity.
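
To illustrate why a spherical parametrisation removes the magnitude constraint, the following minimal sketch maps a radius and a set of free angles to a Cartesian vector using the standard hyperspherical formula. It is not the MBMF implementation; the function name and the example values are assumptions made for illustration.

```python
import numpy as np

def spherical_to_cartesian(radius, angles):
    """Map (radius, d-1 angles) to a d-dimensional Cartesian vector.

    Standard hyperspherical coordinates:
      x_k = r * sin(a_1) * ... * sin(a_{k-1}) * cos(a_k)   for k < d
      x_d = r * sin(a_1) * ... * sin(a_{d-1})
    """
    angles = np.asarray(angles, dtype=float)
    # sin_prod[k] is the product of the sines of the first k angles (1 for k = 0).
    sin_prod = np.concatenate(([1.0], np.cumprod(np.sin(angles))))
    # The last coordinate has no cosine factor, so it is padded with 1.
    cos_terms = np.concatenate((np.cos(angles), [1.0]))
    return radius * sin_prod * cos_terms

# Any real-valued angles give a vector whose norm equals the chosen radius,
# so the angles can be optimised without constraints (hypothetical values).
rng = np.random.default_rng(0)
angles = rng.uniform(-10.0, 10.0, size=7)
x = spherical_to_cartesian(2.5, angles)
norm_x = np.linalg.norm(x)
```

Since the norm is fixed by construction, a gradient method such as SGD can update the angles freely while each feature vector keeps its prescribed magnitude.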

Machine Learning

Learning the Fundamental MIR Spectral Components of Galaxies with Non-Negative Matrix Factorisation

P.D. Hurley, S. Oliver, D. Farrah (2013)

The mid-infrared (MIR) spectra observed with the Spitzer Infrared Spectrograph (IRS) provide a valuable dataset for untangling the physical processes and conditions within galaxies. This paper presents the first attempt to blindly learn fundamental spectral components of MIR galaxy spectra, using non-negative matrix factorisation (NMF). NMF is a recently developed multivariate technique shown to be successful in blind source separation problems. Unlike the more popular multivariate analysis technique, principal component analysis, NMF imposes the condition that weights and spectral components are non-negative. This more closely resembles the physical process of emission in the mid-infrared, resulting in physically intuitive components. By applying NMF to galaxy spectra in the Cornell Atlas of Spitzer/IRS sources (CASSIS), we find similar components amongst different NMF sets. These similar components include two for AGN emission and one for star formation. [... ABBREVIATED...] We show that an NMF set with seven components can reconstruct the general spectral shape of a wide variety of objects, though it struggles to fit the varying strength of emission lines. We also show that the seven components can be used to separate out different types of objects. We model this separation with Gaussian mixture modelling and use the result to provide a classification tool. We also show that the NMF components can be used to separate out the emission from AGN and star formation regions, and define a new star formation/AGN diagnostic which is consistent with all mid-infrared diagnostics already in use but has the advantage that it can be applied to mid-infrared spectra with low signal-to-noise or limited spectral range. The 7 NMF components and code for classification are made public on arXiv and are available at: https://github.com/pdh21/NMF_software/

Cosmology and Nongalactic Astrophysics

Latent Space Factorisation and Manipulation via Matrix Subspace Projection

Xiao Li, Chenghua Lin, Ruizhe Li (2019)

We tackle the problem of disentangling the latent space of an autoencoder in order to separate labelled attribute information from other characteristic information. This then allows us to change selected attributes while preserving other information. Our method, matrix subspace projection, is much simpler than previous approaches to latent space factorisation, for example not requiring multiple discriminators or a careful weighting among their loss functions. Furthermore, our new model can be applied to autoencoders as a plugin, and works across diverse domains such as images or text. We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods. The quality of generation of our new model (e.g. reconstruction, conditional generation) is highly competitive with a number of strong baselines.

Machine Learning