
Learning overcomplete, low coherence dictionaries with linear inference

Added by Jesse Livezey
Publication date: 2016
Language: English





Finding overcomplete latent representations of data has applications in data analysis, signal processing, machine learning, theoretical neuroscience and many other fields. In an overcomplete representation, the number of latent features exceeds the data dimensionality, which is useful when the data is undersampled by the measurements (compressed sensing, information bottlenecks in neural systems) or composed from multiple complete sets of linear features, each spanning the data space. Independent Components Analysis (ICA) is a linear technique for learning sparse latent representations, which typically has a lower computational cost than sparse coding, its nonlinear, recurrent counterpart. While well suited for finding complete representations, we show that overcompleteness poses a challenge to existing ICA algorithms. Specifically, the coherence control in existing ICA algorithms, necessary to prevent the formation of duplicate dictionary features, is ill-suited in the overcomplete case. We show that in this case several existing ICA algorithms have undesirable global minima that maximize coherence. Further, by comparing ICA algorithms on synthetic data and natural images to the computationally more expensive sparse coding solution, we show that the coherence control biases the exploration of the data manifold, sometimes yielding suboptimal solutions. We provide a theoretical explanation of these failures and, based on the theory, propose improved overcomplete ICA algorithms. All told, this study contributes new insights into and methods for coherence control for linear ICA, some of which are applicable to many other, potentially nonlinear, unsupervised learning methods.



Related research

We study the problem of learning overcomplete HMMs---those that have many hidden states but a small output alphabet. Despite having significant practical importance, such HMMs are poorly understood with no known positive or negative results for efficient learning. In this paper, we present several new results---both positive and negative---which help define the boundaries between the tractable and intractable settings. Specifically, we show positive results for a large subclass of HMMs whose transition matrices are sparse, well-conditioned, and have small probability mass on short cycles. On the other hand, we show that learning is impossible given only a polynomial number of samples for HMMs with a small output alphabet and whose transition matrices are random regular graphs with large degree. We also discuss these results in the context of learning HMMs which can capture long-term dependencies.
In sparse signal representation, the choice of a dictionary often involves a tradeoff between two desirable properties -- the ability to adapt to specific signal data and a fast implementation of the dictionary. To sparsely represent signals residing on weighted graphs, an additional design challenge is to incorporate the intrinsic geometric structure of the irregular data domain into the atoms of the dictionary. In this work, we propose a parametric dictionary learning algorithm to design data-adapted, structured dictionaries that sparsely represent graph signals. In particular, we model graph signals as combinations of overlapping local patterns. We impose the constraint that each dictionary is a concatenation of subdictionaries, with each subdictionary being a polynomial of the graph Laplacian matrix, representing a single pattern translated to different areas of the graph. The learning algorithm adapts the patterns to a training set of graph signals. Experimental results on both synthetic and real datasets demonstrate that the dictionaries learned by the proposed algorithm are competitive with and often better than unstructured dictionaries learned by state-of-the-art numerical learning algorithms in terms of sparse approximation of graph signals. In contrast to the unstructured dictionaries, however, the dictionaries learned by the proposed algorithm feature localized atoms and can be implemented in a computationally efficient manner in signal processing tasks such as compression, denoising, and classification.
Modeling the temporal behavior of data is of paramount importance in many scientific and engineering fields. Baseline methods assume that both the dynamic and observation equations follow linear-Gaussian models. However, many real-world processes cannot be characterized by a single linear behavior. Alternatively, one can consider a piecewise-linear model which, combined with a switching mechanism, is well suited when several modes of behavior are needed. Nevertheless, switching dynamical systems are intractable because their computational complexity increases exponentially with time. In this paper, we propose a variational approximation of piecewise-linear dynamical systems. We provide full details of the derivation of two variational expectation-maximization algorithms, a filter and a smoother. We show that the model parameters can be split into two sets, static and dynamic parameters, and that the former can be estimated off-line together with the number of linear modes, or the number of states of the switching variable. We apply the proposed method to a visual tracking problem, namely head-pose tracking, and we thoroughly compare our algorithm with several state-of-the-art trackers.
Yinghua Li, Bin Song, Jie Guo (2019)
Magnetic Resonance Imaging (MRI) images often have limited and unsatisfactory resolution due to various physical, technological, and economic constraints. Super-resolution techniques can obtain high-resolution MRI images. Traditional methods enhance the resolution of brain MRI by interpolation, which affects the accuracy of subsequent diagnosis, while the requirements on brain image quality are rapidly increasing. In this paper, we propose an image super-resolution (SR) method based on overcomplete dictionaries and the inherent similarity of an image to recover the high-resolution (HR) image from a single low-resolution (LR) image. We exploit the nonlocal similarity of the image to search for similar blocks in the whole image and present a joint reconstruction method based on compressive sensing (CS) and similarity constraints, taking the sparsity and self-similarity of the image blocks as the constraints. The proposed method proceeds in the following steps. First, a dictionary classification method based on the measurement domain is presented: image blocks are classified into smooth, texture, and edge parts by analyzing their features in the measurement domain, and the corresponding dictionaries are trained using the classified blocks. Then, in the reconstruction step, we use CS reconstruction to recover the HR brain MRI image, considering both the nonlocal similarity and the sparsity of the image as constraints. The proposed method performs better, both visually and quantitatively, than several existing methods.
Qing Qu, Yuexiang Zhai, Xiao Li (2019)
We study nonconvex optimization landscapes for learning overcomplete representations, including learning (i) sparsely used overcomplete dictionaries and (ii) convolutional dictionaries, where these unsupervised learning problems find many applications in high-dimensional data analysis. Despite the empirical success of simple nonconvex algorithms, theoretical justifications of why these methods work so well are far from satisfactory. In this work, we show these problems can be formulated as $\ell^4$-norm optimization problems with spherical constraints, and we study the geometric properties of their nonconvex optimization landscapes. For both problems, we show the nonconvex objectives have benign geometric structures -- every local minimizer is close to one of the target solutions and every saddle point exhibits negative curvature -- either in the entire space or within a sufficiently large region. This discovery ensures that local search algorithms (such as Riemannian gradient descent) with simple initializations approximately find the target solutions. Finally, numerical experiments justify our theoretical discoveries.
