ترغب بنشر مسار تعليمي؟ اضغط هنا

Stable safe screening and structured dictionaries for faster L1 regularization

56   0   0.0 ( 0 )
 نشر من قبل Cassio Dantas
 تاريخ النشر 2018
والبحث باللغة English




اسأل ChatGPT حول البحث

In this paper, we propose a way to combine two acceleration techniques for the $ell_{1}$-regularized least squares problem: safe screening tests, which allow to eliminate useless dictionary atoms; and the use of fast structured approximations of the dictionary matrix. To do so, we introduce a new family of screening tests, termed stable screening, which can cope with approximation errors on the dictionary atoms while keeping the safety of the test (i.e. zero risk of rejecting atoms belonging to the solution support). Some of the main existing screening tests are extended to this new framework. The proposed algorithm consists in using a coarser (but faster) approximation of the dictionary at the initial iterations and then switching to better approximations until eventually adopting the original dictionary. A systematic switching criterion based on the duality gap saturation and the screening ratio is derived.Simulation results show significant reductions in both computational complexity and execution times for a wide range of tested scenarios.



قيم البحث

اقرأ أيضاً

We propose $ell_1$ norm regularized quadratic surface support vector machine models for binary classification in supervised learning. We establish their desired theoretical properties, including the existence and uniqueness of the optimal solution, r eduction to the standard SVMs over (almost) linearly separable data sets, and detection of true sparsity pattern over (almost) quadratically separable data sets if the penalty parameter of $ell_1$ norm is large enough. We also demonstrate their promising practical efficiency by conducting various numerical experiments on both synthetic and publicly available benchmark data sets.
Joint network topology inference represents a canonical problem of jointly learning multiple graph Laplacian matrices from heterogeneous graph signals. In such a problem, a widely employed assumption is that of a simple common component shared among multiple networks. However, in practice, a more intricate topological pattern, comprising simultaneously of sparse, homogeneity and heterogeneity components, would exhibit in multiple networks. In this paper, we propose a general graph estimator based on a novel structured fusion regularization that enables us to jointly learn multiple graph Laplacian matrices with such complex topological patterns, and enjoys both high computational efficiency and rigorous theoretical guarantee. Moreover, in the proposed regularization term, the topological pattern among networks is characterized by a Gram matrix, endowing our graph estimator with the ability of flexible modelling different types of topological patterns by different choices of the Gram matrix. Computationally, the regularization term, coupling the parameters together, makes the formulated optimization problem intractable and thus, we develop a computationally-scalable algorithm based on the alternating direction method of multipliers (ADMM) to solve it efficiently. Theoretically, we provide a theoretical analysis of the proposed graph estimator, which establishes a non-asymptotic bound of the estimation error under the high-dimensional setting and reflects the effect of several key factors on the convergence rate of our algorithm. Finally, the superior performance of the proposed method is illustrated through simulated and real data examples.
In sparse signal representation, the choice of a dictionary often involves a tradeoff between two desirable properties -- the ability to adapt to specific signal data and a fast implementation of the dictionary. To sparsely represent signals residing on weighted graphs, an additional design challenge is to incorporate the intrinsic geometric structure of the irregular data domain into the atoms of the dictionary. In this work, we propose a parametric dictionary learning algorithm to design data-adapted, structured dictionaries that sparsely represent graph signals. In particular, we model graph signals as combinations of overlapping local patterns. We impose the constraint that each dictionary is a concatenation of subdictionaries, with each subdictionary being a polynomial of the graph Laplacian matrix, representing a single pattern translated to different areas of the graph. The learning algorithm adapts the patterns to a training set of graph signals. Experimental results on both synthetic and real datasets demonstrate that the dictionaries learned by the proposed algorithm are competitive with and often better than unstructured dictionaries learned by state-of-the-art numerical learning algorithms in terms of sparse approximation of graph signals. In contrast to the unstructured dictionaries, however, the dictionaries learned by the proposed algorithm feature localized atoms and can be implemented in a computationally efficient manner in signal processing tasks such as compression, denoising, and classification.
We propose and evaluate new techniques for compressing and speeding up dense matrix multiplications as found in the fully connected and recurrent layers of neural networks for embedded large vocabulary continuous speech recognition (LVCSR). For compr ession, we introduce and study a trace norm regularization technique for training low rank factor
A deep neural network model is a powerful framework for learning representations. Usually, it is used to learn the relation $x to y$ by exploiting the regularities in the input $x$. In structured output prediction problems, $y$ is multi-dimensional a nd structural relations often exist between the dimensions. The motivation of this work is to learn the output dependencies that may lie in the output data in order to improve the prediction accuracy. Unfortunately, feedforward networks are unable to exploit the relations between the outputs. In order to overcome this issue, we propose in this paper a regularization scheme for training neural networks for these particular tasks using a multi-task framework. Our scheme aims at incorporating the learning of the output representation $y$ in the training process in an unsupervised fashion while learning the supervised mapping function $x to y$. We evaluate our framework on a facial landmark detection problem which is a typical structured output task. We show over two public challenging datasets (LFPW and HELEN) that our regularization scheme improves the generalization of deep neural networks and accelerates their training. The use of unlabeled data and label-only data is also explored, showing an additional improvement of the results. We provide an opensource implementation (https://github.com/sbelharbi/structured-output-ae) of our framework.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا