No Arabic abstract
State-of-the-art methods for Convolutional Sparse Coding usually employ Fourier-domain solvers in order to speed up the convolution operators. However, this approach is not without shortcomings. For example, Fourier-domain representations implicitly assume circular boundary conditions and make it hard to fully exploit the sparsity of the problem as well as the small spatial support of the filters. In this work, we propose a novel stochastic spatial-domain solver, in which a randomized subsampling strategy is introduced during the learning sparse codes. Afterwards, we extend the proposed strategy in conjunction with online learning, scaling the CSC model up to very large sample sizes. In both cases, we show experimentally that the proposed subsampling strategy, with a reasonable selection of the subsampling rate, outperforms the state-of-the-art frequency-domain solvers in terms of execution time without losing the learning quality. Finally, we evaluate the effectiveness of the over-complete dictionary learned from large-scale datasets, which demonstrates an improved sparse representation of the natural images on account of more abundant learned image features.
Many imaging technologies rely on tomographic reconstruction, which requires solving a multidimensional inverse problem given a finite number of projections. Backprojection is a popular class of algorithm for tomographic reconstruction, however it typically results in poor image reconstructions when the projection angles are sparse and/or if the sensors characteristics are not uniform. Several deep learning based algorithms have been developed to solve this inverse problem and reconstruct the image using a limited number of projections. However these algorithms typically require examples of the ground-truth (i.e. examples of reconstructed images) to yield good performance. In this paper, we introduce an unsupervised sparse-view backprojection algorithm, which does not require ground-truth. The algorithm consists of two modules in a generator-projector framework; a convolutional neural network and a spatial transformer network. We evaluated our algorithm using computed tomography (CT) images of the human chest. We show that our algorithm significantly out-performs filtered backprojection when the projection angles are very sparse, as well as when the sensor characteristics vary for different angles. Our approach has practical applications for medical imaging and other imaging modalities (e.g. radar) where sparse and/or non-uniform projections may be acquired due to time or sampling constraints.
Tensor data often suffer from missing value problem due to the complex high-dimensional structure while acquiring them. To complete the missing information, lots of Low-Rank Tensor Completion (LRTC) methods have been proposed, most of which depend on the low-rank property of tensor data. In this way, the low-rank component of the original data could be recovered roughly. However, the shortcoming is that the detail information can not be fully restored, no matter the Sum of the Nuclear Norm (SNN) nor the Tensor Nuclear Norm (TNN) based methods. On the contrary, in the field of signal processing, Convolutional Sparse Coding (CSC) can provide a good representation of the high-frequency component of the image, which is generally associated with the detail component of the data. Nevertheless, CSC can not handle the low-frequency component well. To this end, we propose two novel methods, LRTC-CSC-I and LRTC-CSC-II, which adopt CSC as a supplementary regularization for LRTC to capture the high-frequency components. Therefore, the LRTC-CSC methods can not only solve the missing value problem but also recover the details. Moreover, the regularizer CSC can be trained with small samples due to the sparsity characteristic. Extensive experiments show the effectiveness of LRTC-CSC methods, and quantitative evaluation indicates that the performance of our models are superior to state-of-the-art methods.
The recently proposed multi-layer sparse model has raised insightful connections between sparse representations and convolutional neural networks (CNN). In its original conception, this model was restricted to a cascade of convolutional synthesis representations. In this paper, we start by addressing a more general model, revealing interesting ties to fully connected networks. We then show that this multi-layer construction admits a brand new interpretation in a unique symbiosis between synthesis and analysis models: while the deepest layer indeed provides a synthesis representation, the mid-layers decompositions provide an analysis counterpart. This new perspective exposes the suboptimality of previously proposed pursuit approaches, as they do not fully leverage all the information comprised in the model constraints. Armed with this understanding, we address fundamental theoretical issues, revisiting previous analysis and expanding it. Motivated by the limitations of previous algorithms, we then propose an integrated - holistic - alternative that estimates all representations in the model simultaneously, and analyze all these different schemes under stochastic noise assumptions. Inspired by the synthesis-analysis duality, we further present a Holistic Pursuit algorithm, which alternates between synthesis and analysis sparse coding steps, eventually solving for the entire model as a whole, with provable improved performance. Finally, we present numerical results that demonstrate the practical advantages of our approach.
We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings. We show an efficient learning algorithm based on stochastic proximal methods that is significantly faster than previous approaches, making it possible to perform hierarchical sparse coding on a corpus of billions of word tokens. Experiments on various benchmark tasks---word similarity ranking, analogies, sentence completion, and sentiment analysis---demonstrate that the method outperforms or is competitive with state-of-the-art methods. Our word representations are available at url{http://www.ark.cs.cmu.edu/dyogatam/wordvecs/}.
Convolutional sparse coding (CSC) can learn representative shift-invariant patterns from multiple kinds of data. However, existing CSC methods can only model noises from Gaussian distribution, which is restrictive and unrealistic. In this paper, we propose a general CSC model capable of dealing with complicated unknown noise. The noise is now modeled by Gaussian mixture model, which can approximate any continuous probability density function. We use the expectation-maximization algorithm to solve the problem and design an efficient method for the weighted CSC problem in maximization step. The crux is to speed up the convolution in the frequency domain while keeping the other computation involving weight matrix in the spatial domain. Besides, we simultaneously update the dictionary and codes by nonconvex accelerated proximal gradient algorithm without bringing in extra alternating loops. The resultant method obtains comparable time and space complexity compared with existing CSC methods. Extensive experiments on synthetic and real noisy biomedical data sets validate that our method can model noise effectively and obtain high-quality filters and representation.