
Complete Dictionary Learning via $\ell^4$-Norm Maximization over the Orthogonal Group

Published by Yuexiang Zhai
Publication date: 2019
Language: English





This paper considers the fundamental problem of learning a complete (orthogonal) dictionary from samples of sparsely generated signals. Most existing methods solve for the dictionary (and sparse representations) using heuristic algorithms, usually without theoretical guarantees on either optimality or complexity. Recent $\ell^1$-minimization based methods do provide such guarantees, but the associated algorithms recover the dictionary one column at a time. In this work, we propose a new formulation that maximizes the $\ell^4$-norm over the orthogonal group in order to learn the entire dictionary at once. We prove that under a random data model, with nearly minimum sample complexity, the global optima of the $\ell^4$-norm objective are very close to signed permutations of the ground truth. Inspired by this observation, we give a conceptually simple yet effective algorithm based on matching, stretching, and projection (MSP). The algorithm provably converges locally at a superlinear (cubic) rate, and the cost per iteration is merely an SVD. In addition to strong theoretical guarantees, experiments show that the new algorithm is significantly more efficient and effective than existing methods, including KSVD and $\ell^1$-based methods. Preliminary experimental results on mixed real imagery data clearly demonstrate the advantages of the learned dictionary over classic PCA bases.
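The MSP update described above is simple enough to state concretely. The sketch below is a minimal NumPy illustration, not the paper's reference implementation: the variable names, the Bernoulli-Gaussian data model, and the fixed iteration count are our own assumptions. Each step forms the entrywise cube of $AY$, multiplies by $Y^\top$, and projects back onto the orthogonal group via the polar factor of a single SVD.

```python
import numpy as np

def msp_step(A, Y):
    """One MSP-style (matching, stretching, projection) update.

    A : (n, n) current orthogonal dictionary estimate
    Y : (n, p) matrix of observed samples
    Returns the closest orthogonal matrix to (A Y)^{o3} Y^T, i.e. the polar
    factor obtained from a single SVD.
    """
    G = (A @ Y) ** 3 @ Y.T           # entrywise cube: ascent direction for ||A Y||_4^4
    U, _, Vt = np.linalg.svd(G)      # projection onto the orthogonal group
    return U @ Vt

# Hypothetical usage on synthetic sparsely generated signals.
rng = np.random.default_rng(0)
n, p = 20, 2000
D, _ = np.linalg.qr(rng.standard_normal((n, n)))              # ground-truth orthogonal dictionary
X = rng.standard_normal((n, p)) * (rng.random((n, p)) < 0.1)  # sparse codes (Bernoulli-Gaussian)
Y = D @ X                                                     # observed samples
A, _ = np.linalg.qr(rng.standard_normal((n, n)))              # random orthogonal initialization
for _ in range(30):
    A = msp_step(A, Y)
# A @ D should now be close to a signed permutation matrix.
```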


Read also

Yifei Shen, Ye Xue, Jun Zhang (2020)
Dictionary learning is a classic representation learning method that has been widely applied in signal processing and data analytics. In this paper, we investigate a family of $\ell_p$-norm ($p > 2$, $p \in \mathbb{N}$) maximization approaches to the complete dictionary learning problem from both theoretical and algorithmic aspects. Specifically, we prove that the global maximizers of these formulations are very close to the true dictionary with high probability, even when Gaussian noise is present. Based on the generalized power method (GPM), an efficient algorithm is then developed for the $\ell_p$-based formulations. We further show the efficacy of the developed algorithm: for the population GPM algorithm over the sphere constraint, it first quickly enters the neighborhood of a global maximizer, and then converges linearly in this region. Extensive experiments demonstrate that the $\ell_p$-based approaches enjoy higher computational efficiency and better robustness than conventional approaches, with $p = 3$ performing the best.
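As a rough illustration of the generalized power method mentioned above, the following sketch is our own simplification restricted to a single unit-norm direction on the sphere; the names `q`, `Y` and the default `p = 3` are assumptions, not the paper's exact algorithm. It takes a gradient-type step for the $\ell_p^p$ objective and renormalizes, which is the power-iteration flavor of the update.

```python
import numpy as np

def gpm_step(q, Y, p=3):
    """One generalized-power-method style update on the unit sphere.

    q : (n,) current unit-norm direction
    Y : (n, m) data matrix
    Moves along the Euclidean gradient of sum_i |y_i^T q|^p and projects
    back onto the sphere by renormalizing.
    """
    z = Y.T @ q
    g = Y @ (np.sign(z) * np.abs(z) ** (p - 1))  # gradient of the l_p^p objective (up to the factor p)
    return g / np.linalg.norm(g)
```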
Ye Xue, Vincent Lau (2021)
Dictionary learning is a widely used unsupervised learning method in signal processing and machine learning. Most existing works on dictionary learning operate in an offline manner, following one of two main approaches. One is to alternately optimize both the dictionary and the sparse code; the other is to optimize the dictionary by restricting it to the orthogonal group. The latter, called orthogonal dictionary learning, has a lower-complexity implementation and is hence more favorable for low-cost devices. However, existing schemes for orthogonal dictionary learning only work with batch data and cannot be implemented online, which makes them unsuitable for real-time applications. This paper proposes a novel online orthogonal dictionary scheme that dynamically learns the dictionary from streaming data without storing the historical data. The proposed scheme includes a novel problem formulation and an efficient online algorithm design with convergence analysis. In the problem formulation, we relax the orthogonality constraint to enable an efficient online algorithm. In the algorithm design, we propose a new Frank-Wolfe-based online algorithm with a convergence rate of $O(\ln t / t^{1/4})$. The convergence rate in terms of key system parameters is also derived. Experiments with synthetic data and real-world sensor readings demonstrate the effectiveness and efficiency of the proposed online orthogonal dictionary learning scheme.
Phaseless diffraction measurements recorded by a CCD detector are often affected by Poisson noise. In this paper, we propose a dictionary learning model that employs patch-based sparsity to denoise Poisson phaseless measurements. The model consists of three terms: (i) a representation term using an orthogonal dictionary, (ii) an $L^0$ pseudo-norm on the coefficient matrix, and (iii) a Kullback-Leibler divergence to fit the phaseless Poisson data. A fast Alternating Minimization Method (AMM) and Proximal Alternating Linearized Minimization (PALM) are adopted to solve the established model with convergence guarantees; in particular, global convergence is derived for PALM. The subproblems of both algorithms admit fast solvers; indeed, the sparse coding and dictionary updating steps both have closed-form solutions due to the orthogonality of the learned dictionaries. Numerical experiments on phase retrieval with coded diffraction and ptychographic patterns demonstrate the efficiency and robustness of the proposed methods, which, by preserving texture features, produce visually and quantitatively improved denoised images compared with phase retrieval algorithms without regularization and with local sparsity-promoting algorithms.
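To make concrete why orthogonality yields closed-form subproblem solutions, the sketch below shows the two inner updates in isolation. The names `P`, `C`, `lam` and the plain least-squares fit are our own simplification; the paper's actual subproblems also involve the Kullback-Leibler term for the phaseless Poisson data, handled by the outer AMM/PALM iterations, which are not shown. With an orthogonal dictionary, the $L^0$-penalized sparse coding reduces to hard thresholding, and the dictionary update is an orthogonal Procrustes problem solved by one SVD.

```python
import numpy as np

def sparse_coding_step(D, P, lam):
    """With D orthogonal, min_C ||P - D C||_F^2 + lam * ||C||_0 decouples
    entrywise and is solved by hard-thresholding D^T P at sqrt(lam)."""
    C = D.T @ P
    C[C ** 2 < lam] = 0.0
    return C

def dictionary_step(P, C):
    """Orthogonal Procrustes: min over orthogonal D of ||P - D C||_F^2
    has the closed-form solution D = U V^T, where P C^T = U S V^T."""
    U, _, Vt = np.linalg.svd(P @ C.T)
    return U @ Vt
```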
In over two decades of research, the field of dictionary learning has gathered a large collection of successful applications, yet theoretical guarantees for model recovery are known only when optimization is carried out in the same model class as that of the underlying dictionary. This work characterizes the surprising phenomenon that dictionary recovery can be facilitated by searching over the space of larger, over-realized models. The observation is general and independent of the specific dictionary learning algorithm used. We thoroughly demonstrate this observation in practice and provide an analysis of the phenomenon by tying recovery measures to generalization bounds. In particular, we show that model recovery can be upper-bounded by the empirical risk (a model-dependent quantity) and the generalization gap, reflecting our empirical findings. We further show that an efficient and provably correct distillation approach can be employed to recover the correct atoms from the over-realized model. As a result, our meta-algorithm provides dictionary estimates with consistently better recovery of the ground-truth model.
Ye Xue, Yifei Shen, Vincent Lau (2020)
Massive MIMO has been regarded as a key enabling technique for 5G and beyond networks. Nevertheless, its performance is limited by the large overhead needed to obtain high-dimensional channel information. To reduce the huge training overhead associated with conventional pilot-aided designs, we propose a novel blind data detection method that leverages the channel sparsity and data concentration properties. Specifically, we propose a novel $\ell_3$-norm-based formulation to recover the data without channel estimation. We prove that the global optimal solution to the proposed formulation can be made arbitrarily close to the transmitted data, up to a phase-permutation ambiguity. We then propose an efficient parameter-free algorithm to solve the $\ell_3$-norm problem and resolve the phase-permutation ambiguity. We also derive the convergence rate in terms of key system parameters, such as the number of transmitters and receivers, the channel noise power, and the channel sparsity level. Numerical experiments show that the proposed scheme achieves superior performance with low computational complexity.


