Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Complete Dictionary Learning via $ell_p$-norm Maximization

83 0 0.0 ( 0 )

Download Cite

Added by Yifei Shen

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Yifei Shen - Ye Xue - Jun Zhang

Machine Learning Information Theory Signal Processing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Dictionary learning is a classic representation learning method that has been widely applied in signal processing and data analytics. In this paper, we investigate a family of $ell_p$-norm ($p>2,p in mathbb{N}$) maximization approaches for the complete dictionary learning problem from theoretical and algorithmic aspects. Specifically, we prove that the global maximizers of these formulations are very close to the true dictionary with high probability, even when Gaussian noise is present. Based on the generalized power method (GPM), an efficient algorithm is then developed for the $ell_p$-based formulations. We further show the efficacy of the developed algorithm: for the population GPM algorithm over the sphere constraint, it first quickly enters the neighborhood of a global maximizer, and then converges linearly in this region. Extensive experiments will demonstrate that the $ell_p$-based approaches enjoy a higher computational efficiency and better robustness than conventional approaches and $p=3$ performs the best.

rate research

Complete Dictionary Learning via $ell^4$-Norm Maximization over the Orthogonal Group

136 - Yuexiang Zhai , Zitong Yang , Zhenyu Liao 2019

This paper considers the fundamental problem of learning a complete (orthogonal) dictionary from samples of sparsely generated signals. Most existing methods solve the dictionary (and sparse representations) based on heuristic algorithms, usually without theoretical guarantees for either optimality or complexity. The recent $ell^1$-minimization based methods do provide such guarantees but the associated algorithms recover the dictionary one column at a time. In this work, we propose a new formulation that maximizes the $ell^4$-norm over the orthogonal group, to learn the entire dictionary. We prove that under a random data model, with nearly minimum sample complexity, the global optima of the $ell^4$ norm are very close to signed permutations of the ground truth. Inspired by this observation, we give a conceptually simple and yet effective algorithm based on matching, stretching, and projection (MSP). The algorithm provably converges locally at a superlinear (cubic) rate and cost per iteration is merely an SVD. In addition to strong theoretical guarantees, experiments show that the new algorithm is significantly more efficient and effective than existing methods, including KSVD and $ell^1$-based methods. Preliminary experimental results on mixed real imagery data clearly demonstrate advantages of so learned dictionary over classic PCA bases.

Machine Learning Signal Processing Computation

Simple and practical algorithms for $ell_p$-norm low-rank approximation

226 - Anastasios Kyrillidis 2018

We propose practical algorithms for entrywise $ell_p$-norm low-rank approximation, for $p = 1$ or $p = infty$. The proposed framework, which is non-convex and gradient-based, is easy to implement and typically attains better approximations, faster, than state of the art. From a theoretical standpoint, we show that the proposed scheme can attain $(1 + varepsilon)$-OPT approximations. Our algorithms are not hyperparameter-free: they achieve the desiderata only assuming algorithms hyperparameters are known a priori---or are at least approximable. I.e., our theory indicates what problem quantities need to be known, in order to get a good solution within polynomial time, and does not contradict to recent inapproximabilty results, as in [46].

Machine Learning Information Theory Numerical Analysis

Blind Data Detection in Massive MIMO via $ell_3$-norm Maximization over the Stiefel Manifold

97 - Ye Xue , Yifei Shen , Vincent Lau 2020

Massive MIMO has been regarded as a key enabling technique for 5G and beyond networks. Nevertheless, its performance is limited by the large overhead needed to obtain the high-dimensional channel information. To reduce the huge training overhead associated with conventional pilot-aided designs, we propose a novel blind data detection method by leveraging the channel sparsity and data concentration properties. Specifically, we propose a novel $ell_3$-norm-based formulation to recover the data without channel estimation. We prove that the global optimal solution to the proposed formulation can be made arbitrarily close to the transmitted data up to a phase-permutation ambiguity. We then propose an efficient parameter-free algorithm to solve the $ell_3$-norm problem and resolve the phase permutation ambiguity. We also derive the convergence rate in terms of key system parameters such as the number of transmitters and receivers, the channel noise power, and the channel sparsity level. Numerical experiments will show that the proposed scheme has superior performance with low computational complexity.

Signal Processing Information Theory Information Theory

Iterative Refinement for $ell_p$-norm Regression

114 - Deeksha Adil , Rasmus Kyng , Richard Peng 2019

We give improved algorithms for the $ell_{p}$-regression problem, $min_{x} |x|_{p}$ such that $A x=b,$ for all $p in (1,2) cup (2,infty).$ Our algorithms obtain a high accuracy solution in $tilde{O}_{p}(m^{frac{|p-2|}{2p + |p-2|}}) le tilde{O}_{p}(m^{frac{1}{3}})$ iterations, where each iteration requires solving an $m times m$ linear system, $m$ being the dimension of the ambient space. By maintaining an approximate inverse of the linear systems that we solve in each iteration, we give algorithms for solving $ell_{p}$-regression to $1 / text{poly}(n)$ accuracy that run in time $tilde{O}_p(m^{max{omega, 7/3}}),$ where $omega$ is the matrix multiplication constant. For the current best value of $omega > 2.37$, we can thus solve $ell_{p}$ regression as fast as $ell_{2}$ regression, for all constant $p$ bounded away from $1.$ Our algorithms can be combined with fast graph Laplacian linear equation solvers to give minimum $ell_{p}$-norm flow / voltage solutions to $1 / text{poly}(n)$ accuracy on an undirected graph with $m$ edges in $tilde{O}_{p}(m^{1 + frac{|p-2|}{2p + |p-2|}}) le tilde{O}_{p}(m^{frac{4}{3}})$ time. For sparse graphs and for matrices with similar dimensions, our iteration counts and running times improve on the $p$-norm regression algorithm by [Bubeck-Cohen-Lee-Li STOC`18] and general-purpose convex optimization algorithms. At the core of our algorithms is an iterative refinement scheme for $ell_{p}$-norms, using the smoothed $ell_{p}$-norms introduced in the work of Bubeck et al. Given an initial solution, we construct a problem that seeks to minimize a quadratically-smoothed $ell_{p}$ norm over a subspace, such that a crude solution to this problem allows us to improve the initial solution by a constant factor, leading to algorithms with fast convergence.

Data Structures and Algorithms Numerical Analysis Optimization and Control

Graph Representation Learning via Graphical Mutual Information Maximization

162 - Zhen Peng , Wenbing Huang , Minnan Luo 2020

The richness in the content of various information networks such as social networks and communication networks provides the unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computations from vector space to the graph domain where measuring mutual information from two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: First, it is invariant to the isomorphic transformation of input graphs---an inevitable constraint in many existing graph representation learning algorithms; Besides, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; Finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Considerable experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and even sometimes exceeds the performance of supervised ones.

Machine Learning Artificial Intelligence Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Complete Dictionary Learning via $ell_p$-norm Maximization

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions