Two-way Spectrum Pursuit for CUR Decomposition and Its Application in Joint Column/Row Subset Selection

41 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ashkan Esmaeili

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ashkan Esmaeili - Mohsen Joneidi - Mehrdad Salimitari

التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The problem of simultaneous column and row subset selection is addressed in this paper. The column space and row space of a matrix are spanned by its left and right singular vectors, respectively. However, the singular vectors are not within actual columns/rows of the matrix. In this paper, an iterative approach is proposed to capture the most structural information of columns/rows via selecting a subset of actual columns/rows. This algorithm is referred to as two-way spectrum pursuit (TWSP) which provides us with an accurate solution for the CUR matrix decomposition. TWSP is applicable in a wide range of applications since it enjoys a linear complexity w.r.t. number of original columns/rows. We demonstrated the application of TWSP for joint channel and sensor selection in cognitive radio networks, informative users and contents detection, and efficient supervised data reduction.

قيم البحث

92 - Changsheng Li , Xiangfeng Wang , Weishan Dong 2015

This paper presents an unsupervised learning approach for simultaneous sample and feature selection, which is in contrast to existing works which mainly tackle these two problems separately. In fact the two tasks are often interleaved with each other : noisy and high-dimensional features will bring adverse effect on sample selection, while informative or representative samples will be beneficial to feature selection. Specifically, we propose a framework to jointly conduct active learning and feature selection based on the CUR matrix decomposition. From the data reconstruction perspective, both the selected samples and features can best approximate the original dataset respectively, such that the selected samples characterized by the features are highly representative. In particular, our method runs in one-shot without the procedure of iterative sample selection for progressive labeling. Thus, our model is especially suitable when there are few labeled samples or even in the absence of supervision, which is a particular challenge for existing methods. As the joint learning problem is NP-hard, the proposed formulation involves a convex but non-smooth optimization problem. We solve it efficiently by an iterative algorithm, and prove its global convergence. Experimental results on publicly available datasets corroborate the efficacy of our method compared with the state-of-the-art.

التعلم الآلي

Streaming and Distributed Algorithms for Robust Column Subset Selection

126 - Shuli Jiang , Dongyu Li , Irene Mengze Li 2021

We give the first single-pass streaming algorithm for Column Subset Selection with respect to the entrywise $ell_p$-norm with $1 leq p < 2$. We study the $ell_p$ norm loss since it is often considered more robust to noise than the standard Frobenius norm. Given an input matrix $A in mathbb{R}^{d times n}$ ($n gg d$), our algorithm achieves a multiplicative $k^{frac{1}{p} - frac{1}{2}}text{poly}(log nd)$-approximation to the error with respect to the best possible column subset of size $k$. Furthermore, the space complexity of the streaming algorithm is optimal up to a logarithmic factor. Our streaming algorithm also extends naturally to a 1-round distributed protocol with nearly optimal communication cost. A key ingredient in our algorithms is a reduction to column subset selection in the $ell_{p,2}$-norm, which corresponds to the $p$-norm of the vector of Euclidean norms of each of the columns of $A$. This enables us to leverage strong coreset constructions for the Euclidean norm, which previously had not been applied in this context. We also give the first provable guarantees for greedy column subset selection in the $ell_{1, 2}$ norm, which can be used as an alternative, practical subroutine in our algorithms. Finally, we show that our algorithms give significant practical advantages on real-world data analysis tasks.

بنى وهياكل البيانات والخوارزميات

Fast Robust Tensor Principal Component Analysis via Fiber CUR Decomposition

131 - HanQin Cai , Zehan Chao , Longxiu Huang 2021

We study the problem of tensor robust principal component analysis (TRPCA), which aims to separate an underlying low-multilinear-rank tensor and a sparse outlier tensor from their sum. In this work, we propose a fast non-convex algorithm, coined Robu st Tensor CUR (RTCUR), for large-scale TRPCA problems. RTCUR considers a framework of alternating projections and utilizes the recently developed tensor Fiber CUR decomposition to dramatically lower the computational complexity. The performance advantage of RTCUR is empirically verified against the state-of-the-arts on the synthetic datasets and is further demonstrated on the real-world application such as color video background subtraction.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

Robust CUR Decomposition: Theory and Imaging Applications

422 - HanQin Cai , Keaton Hamm , Longxiu Huang 2021

This paper considers the use of Robust PCA in a CUR decomposition framework and applications thereof. Our main algorithms produce a robust version of column-row factorizations of matrices $mathbf{D}=mathbf{L}+mathbf{S}$ where $mathbf{L}$ is low-rank and $mathbf{S}$ contains sparse outliers. These methods yield interpretable factorizations at low computational cost, and provide new CUR decompositions that are robust to sparse outliers, in contrast to previous methods. We consider two key imaging applications of Robust PCA: video foreground-background separation and face modeling. This paper examines the qualitative behavior of our Robust CUR decompositions on the benchmark videos and face datasets, and find that our method works as well as standard Robust PCA while being significantly faster. Additionally, we consider hybrid randomized and deterministic sampling methods which produce a compact CUR decomposition of a given matrix, and apply this to video sequences to produce canonical frames thereof.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

Orthogonal Matching Pursuit with Thresholding and its Application in Compressive Sensing

114 - Mingrui Yang , Frank de Hoog 2013

Greed is good. However, the tighter you squeeze, the less you have. In this paper, a less greedy algorithm for sparse signal reconstruction in compressive sensing, named orthogonal matching pursuit with thresholding is studied. Using the global 2-coh erence , which provides a bridge between the well known mutual coherence and the restricted isometry constant, the performance of orthogonal matching pursuit with thresholding is analyzed and more general results for sparse signal reconstruction are obtained. It is also shown that given the same assumption on the coherence index and the restricted isometry constant as required for orthogonal matching pursuit, the thresholding variation gives exactly the same reconstruction performance with significantly less complexity.

نظرية المعلومات نظرية المعلومات