
Nonnegative Matrix Factorization (NMF) with Heteroscedastic Uncertainties and Missing data

Published by: Guangtun Zhu
Publication date: 2016
Research field: Physics
Language: English
Author: Guangtun Zhu





Dimensionality reduction and matrix factorization techniques are important and useful machine-learning techniques in many fields. Nonnegative matrix factorization (NMF) is particularly useful for spectral analysis and image processing in astronomy. I present the vectorized update rules and an independent proof of their convergence for NMF with heteroscedastic measurements and missing data. I release a Python implementation of the rules and use an optical spectroscopic dataset of extragalactic sources as an example for demonstration. A future paper will present results of applying the technique to image processing of planetary disks.
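The abstract does not spell out the update rules, so the following is only a minimal sketch of one common form of weighted multiplicative updates, assuming the objective is the weighted squared error $\sum_{ij} V_{ij}\,(X_{ij} - (WH)_{ij})^2$ with weights $V = 1/\sigma^2$ and $V = 0$ for missing entries. The function name `weighted_nmf`, its parameters, and the synthetic data are illustrative assumptions and are not taken from the released Python implementation.

```python
import numpy as np

def weighted_nmf(X, V, n_components, n_iter=500, eps=1e-12, seed=0):
    """Sketch of weighted NMF (hypothetical helper, not the author's code).

    Minimizes sum(V * (X - W @ H)**2) subject to W >= 0 and H >= 0.
    X : (n, m) nonnegative data; entries with V == 0 are ignored.
    V : (n, m) nonnegative weights, e.g. 1 / sigma**2; 0 marks missing data.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Replace missing entries (possibly NaN) by 0 so they cannot propagate.
    Xc = np.where(V > 0, X, 0.0)
    VX = V * Xc
    W = rng.random((n, n_components)) + 0.1
    H = rng.random((n_components, m)) + 0.1
    history = []
    for _ in range(n_iter):
        # Vectorized multiplicative updates; eps guards against division by 0.
        W *= (VX @ H.T) / ((V * (W @ H)) @ H.T + eps)
        H *= (W.T @ VX) / (W.T @ (V * (W @ H)) + eps)
        history.append(float(np.sum(V * (Xc - W @ H) ** 2)))
    return W, H, history

# Synthetic example: rank-3 data, per-pixel noise levels, 10% missing entries.
rng = np.random.default_rng(42)
W_true = rng.random((30, 3))
H_true = rng.random((3, 20))
sigma = 0.01 + 0.05 * rng.random((30, 20))
X = np.clip(W_true @ H_true + sigma * rng.standard_normal((30, 20)), 0, None)
V = 1.0 / sigma**2
missing = rng.random((30, 20)) < 0.1
V[missing] = 0.0
X[missing] = np.nan
W, H, history = weighted_nmf(X, V, n_components=3)
```

Because the updates only multiply by nonnegative ratios, `W` and `H` stay nonnegative, and the weighted cost stored in `history` should not increase from one iteration to the next, up to floating-point rounding.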




Read also

Angular differential imaging (ADI) is used to improve contrast in high-resolution astronomical imaging. An example is the direct imaging of exoplanets with a camera fed by extreme adaptive optics. The subtraction of the bright main object, needed to observe the faint companion, has been improved using Principal Component Analysis (PCA), which factorizes the positive astronomical frames into positive and negative components. In contrast, Nonnegative Matrix Factorization (NMF) uses only positive components, mimicking the actual composition of long-exposure images.
Matrix factorization (MF) has been widely used to discover low-rank structure and to predict the missing entries of a data matrix. In many real-world learning systems, the data matrix can be very high-dimensional but sparse. This poses an imbalanced learning problem, since the number of missing entries is usually much larger than that of observed entries, but they cannot be ignored because of the valuable negative signal they carry. For efficiency, existing work typically applies a uniform weight to missing entries to allow a fast learning algorithm. However, this simplification decreases modeling fidelity, resulting in suboptimal performance for downstream applications. In this work, we weight the missing data non-uniformly, and more generically, we allow any weighting strategy on the missing data. To address the efficiency challenge, we propose a fast learning method whose time complexity is determined by the number of observed entries in the data matrix rather than by the matrix size. The key idea is two-fold: 1) we apply truncated SVD to the weight matrix to get a more compact representation of the weights, and 2) we learn MF parameters with element-wise alternating least squares (eALS) and memorize the key intermediate variables to avoid unnecessary repeated computations. We conduct extensive experiments on two recommendation benchmarks, demonstrating the correctness, efficiency, and effectiveness of our fast eALS method.
In this paper, we present several descent methods that can be applied to nonnegative matrix factorization, and we analyze a recently developed fast block coordinate method called Rank-one Residue Iteration (RRI). We also compare these different methods and show that the new block coordinate method has better properties in terms of approximation error and complexity. By interpreting this method as a rank-one approximation of the residue matrix, we prove that it converges. We also extend it to nonnegative tensor factorization and introduce some variants of the method by imposing additional controllable constraints such as sparsity, discreteness, and smoothness.
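The rank-one residue idea described above can be illustrated with a minimal sketch, assuming the plain Frobenius-norm objective $\|X - WH\|_F$ and the standard closed-form nonnegative update of one rank-one term at a time (a scheme also widely known as HALS). The function name `rri_nmf`, its parameters, and the test data are illustrative assumptions, not code from that paper.

```python
import numpy as np

def rri_nmf(X, r, n_iter=200, eps=1e-12, seed=0):
    """Sketch of a rank-one residue style block coordinate descent for NMF.

    Each inner step fits one rank-one term W[:, k] H[k] to the residue left
    by the other terms, projecting the least-squares solution onto >= 0.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, r))
    H = rng.random((r, m))
    errors = []
    for _ in range(n_iter):
        for k in range(r):
            # Residue of X with the k-th rank-one term excluded.
            R = X - W @ H + np.outer(W[:, k], H[k])
            # Closed-form nonnegative least-squares update of each factor.
            H[k] = np.maximum(R.T @ W[:, k], 0.0) / (W[:, k] @ W[:, k] + eps)
            W[:, k] = np.maximum(R @ H[k], 0.0) / (H[k] @ H[k] + eps)
        errors.append(float(np.linalg.norm(X - W @ H)))
    return W, H, errors

# Synthetic example: an exactly rank-3 nonnegative matrix.
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 15))
W, H, errors = rri_nmf(X, r=3)
```

Each block update solves its one-vector subproblem exactly (up to the small `eps` safeguard), so the recorded approximation error should not increase between sweeps.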
In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $AW$ where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$ -- i.e. (approximately) minimize $\|M - AW\|_F$ where $\|\cdot\|_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where $A$ and $W$ are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-complete. We initiate a study of when this problem is solvable in polynomial time: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant $r$. Indeed NMF is most interesting in applications precisely when $r$ is small. 2. We complement this with a hardness result, that if exact NMF can be solved in time $(nm)^{o(r)}$, 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in $n$, $m$ and $r$ under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work.
Fully unsupervised topic models have found fantastic success in document clustering and classification. However, these models often suffer from the tendency to learn less-than-meaningful or even redundant topics when the data is biased towards a set of features. For this reason, we propose an approach based upon the nonnegative matrix factorization (NMF) model, deemed Guided NMF, that incorporates user-designed seed word supervision. Our experimental results demonstrate the promise of this model and illustrate that it is competitive with other methods of this ilk with only very little supervision information.