ترغب بنشر مسار تعليمي؟ اضغط هنا

Statistical Learning Guarantees for Compressive Clustering and Compressive Mixture Modeling

85   0   0.0 ( 0 )
 نشر من قبل Remi Gribonval
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English
 تأليف Remi Gribonval




اسأل ChatGPT حول البحث

We provide statistical learning guarantees for two unsupervised learning tasks in the context of compressive statistical learning, a general framework for resource-efficient large-scale learning that we introduced in a companion paper.The principle of compressive statistical learning is to compress a training collection, in one pass, into a low-dimensional sketch (a vector of random empirical generalized moments) that captures the information relevant to the considered learning task. We explicitly describe and analyze random feature functions which empirical averages preserve the needed information for compressive clustering and compressive Gaussian mixture modeling with fixed known variance, and establish sufficient sketch sizes given the problem dimensions.



قيم البحث

اقرأ أيضاً

180 - Yi Li , Vasileios Nakos 2017
In the compressive phase retrieval problem, or phaseless compressed sensing, or compressed sensing from intensity only measurements, the goal is to reconstruct a sparse or approximately $k$-sparse vector $x in mathbb{R}^n$ given access to $y= |Phi x| $, where $|v|$ denotes the vector obtained from taking the absolute value of $vinmathbb{R}^n$ coordinate-wise. In this paper we present sublinear-time algorithms for different variants of the compressive phase retrieval problem which are akin to the variants considered for the classical compressive sensing problem in theoretical computer science. Our algorithms use pure combinatorial techniques and near-optimal number of measurements.
Local graph clustering methods aim to find small clusters in very large graphs. These methods take as input a graph and a seed node, and they return as output a good cluster in a running time that depends on the size of the output cluster but that is independent of the size of the input graph. In this paper, we adopt a statistical perspective on local graph clustering, and we analyze the performance of the l1-regularized PageRank method~(Fountoulakis et. al.) for the recovery of a single target cluster, given a seed node inside the cluster. Assuming the target cluster has been generated by a random model, we present two results. In the first, we show that the optimal support of l1-regularized PageRank recovers the full target cluster, with bounded false positives. In the second, we show that if the seed node is connected solely to the target cluster then the optimal support of l1-regularized PageRank recovers exactly the target cluster. We also show empirically that l1-regularized PageRank has a state-of-the-art performance on many real graphs, demonstrating the superiority of the method. From a computational perspective, we show that the solution path of l1-regularized PageRank is monotonic. This allows for the application of the forward stagewise algorithm, which approximates the solution path in running time that does not depend on the size of the whole graph. Finally, we show that l1-regularized PageRank and approximate personalized PageRank (APPR), another very popular method for local graph clustering, are equivalent in the sense that we can lower and upper bound the output of one with the output of the other. Based on this relation, we establish for APPR similar results to those we establish for l1-regularized PageRank.
Compressive sensing (CS) has been widely studied and applied in many fields. Recently, the way to perform secure compressive sensing (SCS) has become a topic of growing interest. The existing works on SCS usually take the sensing matrix as a key and the resultant security level is not evaluated in depth. They can only be considered as a preliminary exploration on SCS, but a concrete and operable encipher model is not given yet. In this paper, we are going to investigate SCS in a systematic way. The relationship between CS and symmetric-key cipher indicates some possible encryption models. To this end, we propose the two-level protection models (TLPM) for SCS which are developed from measurements taking and something else, respectively. It is believed that these models will provide a new point of view and stimulate further research in both CS and cryptography. Specifically, an efficient and secure encryption scheme for parallel compressive sensing (PCS) is designed by embedding a two-layer protection in PCS using chaos. The first layer is undertaken by random permutation on a two-dimensional signal, which is proved to be an acceptable permutation with overwhelming probability. The other layer is to sample the permuted signal column by column with the same chaotic measurement matrix, which satisfies the restricted isometry property of PCS with overwhelming probability. Both the random permutation and the measurement matrix are constructed under the control of a chaotic system. Simulation results show that unlike the general joint compression and encryption schemes in which encryption always leads to the same or a lower compression ratio, the proposed approach of embedding encryption in PCS actually improves the compression performance. Besides, the proposed approach possesses high transmission robustness against additive Gaussian white noise and cropping attack.
Compressive Learning is an emerging topic that combines signal acquisition via compressive sensing and machine learning to perform inference tasks directly on a small number of measurements. Many data modalities naturally have a multi-dimensional or tensorial format, with each dimension or tensor mode representing different features such as the spatial and temporal information in video sequences or the spatial and spectral information in hyperspectral images. However, in existing compressive learning frameworks, the compressive sensing component utilizes either random or learned linear projection on the vectorized signal to perform signal acquisition, thus discarding the multi-dimensional structure of the signals. In this paper, we propose Multilinear Compressive Learning, a framework that takes into account the tensorial nature of multi-dimensional signals in the acquisition step and builds the subsequent inference model on the structurally sensed measurements. Our theoretical complexity analysis shows that the proposed framework is more efficient compared to its vector-based counterpart in both memory and computation requirement. With extensive experiments, we also empirically show that our Multilinear Compressive Learning framework outperforms the vector-based framework in object classification and face recognition tasks, and scales favorably when the dimensionalities of the original signals increase, making it highly efficient for high-dimensional multi-dimensional signals.
In coded aperture snapshot spectral imaging (CASSI) system, the real-world hyperspectral image (HSI) can be reconstructed from the captured compressive image in a snapshot. Model-based HSI reconstruction methods employed hand-crafted priors to solve the reconstruction problem, but most of which achieved limited success due to the poor representation capability of these hand-crafted priors. Deep learning based methods learning the mappings between the compressive images and the HSIs directly achieved much better results. Yet, it is nontrivial to design a powerful deep network heuristically for achieving satisfied results. In this paper, we propose a novel HSI reconstruction method based on the Maximum a Posterior (MAP) estimation framework using learned Gaussian Scale Mixture (GSM) prior. Different from existing GSM models using hand-crafted scale priors (e.g., the Jeffreys prior), we propose to learn the scale prior through a deep convolutional neural network (DCNN). Furthermore, we also propose to estimate the local means of the GSM models by the DCNN. All the parameters of the MAP estimation algorithm and the DCNN parameters are jointly optimized through end-to-end training. Extensive experimental results on both synthetic and real datasets demonstrate that the proposed method outperforms existing state-of-the-art methods. The code is available at https://see.xidian.edu.cn/faculty/wsdong/Projects/DGSM-SCI.htm.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا