Phaseless diffraction measurements recorded by a CCD detector are often affected by Poisson noise. In this paper, we propose a dictionary learning model that employs patch-based sparsity to denoise Poisson phaseless measurements. The model consists of three terms: (i) a representation term based on an orthogonal dictionary, (ii) an $L^0$ pseudo-norm of the coefficient matrix, and (iii) a Kullback-Leibler divergence to fit the phaseless Poisson data. A fast Alternating Minimization Method (AMM) and Proximal Alternating Linearized Minimization (PALM) are adopted to solve the established model with convergence guarantees; in particular, global convergence is derived for PALM. The subproblems of both algorithms have fast solvers; indeed, the sparse coding and dictionary updating steps both have closed-form solutions owing to the orthogonality of the learned dictionaries. Numerical experiments for phase retrieval using coded diffraction and ptychographic patterns are performed to show the efficiency and robustness of the proposed methods, which, by preserving texture features, produce visually and quantitatively improved denoised images compared with phase retrieval algorithms without regularization and with local-sparsity-promoting algorithms.
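The closed forms mentioned above for an orthogonal dictionary can be illustrated with a minimal sketch. It assumes a simplified objective $\|Y - DC\|_F^2 + \lambda \|C\|_0$ with $D^\top D = I$ on a given patch matrix $Y$; the Kullback-Leibler data term and the phaseless measurements are omitted, and the function names and the parameter `lam` are hypothetical. Under that assumption, sparse coding reduces to hard thresholding of $D^\top Y$ at $\sqrt{\lambda}$, and the dictionary update is an orthogonal Procrustes problem solved by one SVD.

```python
import numpy as np

def hard_threshold(z, tau):
    # Keep entries whose magnitude exceeds tau, zero out the rest.
    return np.where(np.abs(z) > tau, z, 0.0)

def sparse_code(D, Y, lam):
    # For orthogonal D, ||Y - D C||_F^2 = ||D^T Y - C||_F^2, so the problem
    # separates per entry: keep z if z^2 > lam, i.e. threshold at sqrt(lam).
    return hard_threshold(D.T @ Y, np.sqrt(lam))

def update_dictionary(Y, C):
    # Orthogonal Procrustes: maximize trace(D^T Y C^T) over orthogonal D.
    # With Y C^T = U S V^T, the maximizer is D = U V^T.
    U, _, Vt = np.linalg.svd(Y @ C.T)
    return U @ Vt

def learn_orthogonal_dictionary(Y, lam, n_iter=20):
    # Alternate the two closed-form steps, starting from the identity.
    D = np.eye(Y.shape[0])
    for _ in range(n_iter):
        C = sparse_code(D, Y, lam)
        D = update_dictionary(Y, C)
    return D, sparse_code(D, Y, lam)
```

Because each step solves its subproblem exactly, the simplified objective cannot increase from one iteration to the next. The AMM and PALM schemes of the abstract additionally handle the Poisson data-fitting term, which this sketch does not attempt.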
We propose a general framework to recover underlying images from noisy phaseless diffraction measurements based on the alternating direction method of multipliers (ADMM) and the plug-and-play technique. Each iteration of the algorithm consists of three steps: (i) solving a generalized least squares problem with the maximum a posteriori (MAP) estimate of the noise, (ii) Gaussian denoising, and (iii) updating the multipliers. The denoising step utilizes higher-order filters such as total generalized variation and nonlocal-sparsity-based filters, including nonlocal means (NLM) and block-matching and 3D filtering (BM3D). The multipliers are updated by a symmetric technique to increase convergence speed. The proposed method has low computational complexity and a theoretical convergence guarantee, and it enables recovering images with sharp edges, clean backgrounds and repetitive features from noisy phaseless measurements. Numerical experiments for Fourier phase retrieval (PR) with coded diffraction and ptychographic patterns are performed to verify the convergence and efficiency, showing that our proposed method outperforms state-of-the-art PR algorithms without any regularization and those with total variation regularization.
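The three-step structure described above can be sketched on a toy problem. This is not the phase retrieval algorithm of the abstract: the sketch assumes a directly observed noisy 1D signal with a plain Gaussian least squares data term, uses an edge-padded moving average as a stand-in for the TGV, NLM or BM3D denoisers, and applies a single standard multiplier update rather than the symmetric one. The names `pnp_admm`, `moving_average` and the parameter `rho` are hypothetical.

```python
import numpy as np

def moving_average(x, width=5):
    # Stand-in denoiser (assumption): edge-padded moving average, odd width.
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def pnp_admm(f, denoiser, rho=1.0, n_iter=50):
    # Scaled-form ADMM for min_u 0.5*||u - f||^2 + R(v) subject to u = v,
    # with the proximal step of R replaced by a plug-in denoiser.
    u = f.copy()
    v = f.copy()
    w = np.zeros_like(f)  # scaled multiplier
    for _ in range(n_iter):
        # (i) least squares step: argmin_u 0.5||u-f||^2 + rho/2 ||u-v+w||^2
        u = (f + rho * (v - w)) / (1.0 + rho)
        # (ii) denoising step applied to the shifted iterate
        v = denoiser(u + w)
        # (iii) multiplier update
        w = w + u - v
    return v
```

A usage example under the same assumptions: `pnp_admm(noisy, moving_average)` with `noisy = np.sin(t) + 0.3 * rng.standard_normal(t.size)`. Swapping in a stronger denoiser only changes step (ii), which is the design point of the plug-and-play approach.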
This paper considers the fundamental problem of learning a complete (orthogonal) dictionary from samples of sparsely generated signals. Most existing methods solve for the dictionary (and sparse representations) using heuristic algorithms, usually without theoretical guarantees for either optimality or complexity. The recent $\ell^1$-minimization based methods do provide such guarantees, but the associated algorithms recover the dictionary one column at a time. In this work, we propose a new formulation that maximizes the $\ell^4$-norm over the orthogonal group to learn the entire dictionary. We prove that under a random data model, with nearly minimum sample complexity, the global optima of the $\ell^4$-norm are very close to signed permutations of the ground truth. Inspired by this observation, we give a conceptually simple yet effective algorithm based on matching, stretching, and projection (MSP). The algorithm provably converges locally at a superlinear (cubic) rate, and the cost per iteration is merely one SVD. In addition to strong theoretical guarantees, experiments show that the new algorithm is significantly more efficient and effective than existing methods, including KSVD and $\ell^1$-based methods. Preliminary experimental results on mixed real imagery data clearly demonstrate the advantages of the learned dictionary over classic PCA bases.
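A minimal sketch of an MSP-style iteration follows. It assumes the update takes the form "project $(A Y)^{\circ 3} Y^\top$ onto the orthogonal group", where ${\circ 3}$ denotes the entrywise cube; this form is consistent with the one-SVD-per-iteration cost stated above but is not spelled out in the abstract. The function name `msp`, the random orthogonal initialization, and the recorded objective history are illustrative choices.

```python
import numpy as np

def msp(Y, n_iter=50, seed=0):
    # Maximize ||A Y||_4^4 over orthogonal A (assumed form of the iteration).
    n = Y.shape[0]
    rng = np.random.default_rng(seed)
    A, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal start
    history = []
    for _ in range(n_iter):
        AY = A @ Y                       # matching: current representation
        history.append(np.sum(AY ** 4))  # l4 objective before the update
        G = (AY ** 3) @ Y.T              # stretching: entrywise cube
        U, _, Vt = np.linalg.svd(G)      # projection onto the orthogonal group
        A = U @ Vt
    return A, history
```

Since the $\ell^4$ objective is convex in $A$ and each step maximizes its linearization over the orthogonal group, the recorded objective values are non-decreasing. To try it, one can generate `Y = D0 @ X` with an orthogonal `D0` and a Bernoulli-Gaussian `X`, then inspect whether `A @ D0` is close to a signed permutation matrix.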
Dictionary learning is a widely used unsupervised learning method in signal processing and machine learning. Most existing dictionary learning methods work offline, and they follow one of two main approaches. One alternates between optimizing the dictionary and the sparse code; the other optimizes the dictionary while restricting it to the orthogonal group. The latter, called orthogonal dictionary learning, has a lower-complexity implementation and is therefore better suited to low-cost devices. However, existing orthogonal dictionary learning schemes only work with batch data and cannot be implemented online, which makes them unsuitable for real-time applications. This paper proposes a novel online orthogonal dictionary learning scheme that dynamically learns the dictionary from streaming data without storing historical data. The proposed scheme includes a novel problem formulation and an efficient online algorithm design with convergence analysis. In the problem formulation, we relax the orthogonality constraint to enable an efficient online algorithm. In the algorithm design, we propose a new Frank-Wolfe-based online algorithm with a convergence rate of $O(\ln t / t^{1/4})$. The convergence rate in terms of key system parameters is also derived. Experiments with synthetic data and real-world sensor readings demonstrate the effectiveness and efficiency of the proposed online orthogonal dictionary learning scheme.
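To make the idea of a Frank-Wolfe step over a relaxed orthogonality constraint concrete, here is a generic sketch. The abstract does not say which relaxation is used; the sketch assumes the convex hull of the orthogonal group, that is, the unit spectral-norm ball, and takes the gradient as an input rather than guessing the paper's loss. The function names and the step size $2/(t+2)$ are hypothetical, standard textbook choices, not the authors' online algorithm.

```python
import numpy as np

def lmo_spectral_ball(G):
    # Linear minimization oracle: argmin over ||S||_2 <= 1 of <G, S>.
    # With G = U diag(s) V^T, the minimizer is S = -U V^T.
    U, _, Vt = np.linalg.svd(G)
    return -U @ Vt

def frank_wolfe_step(D, grad, t):
    # Move toward the oracle's vertex by a convex combination, so the
    # iterate stays inside the spectral-norm ball without any projection.
    gamma = 2.0 / (t + 2.0)
    return (1.0 - gamma) * D + gamma * lmo_spectral_ball(grad)
```

In an online setting, `grad` would be a gradient estimate built from the samples seen so far; how that estimate is formed is left open here. The appeal of the approach is that each step needs only one SVD and never has to project back onto the constraint set.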
Word embedding techniques based on co-occurrence statistics have proved very useful for extracting semantic and syntactic representations of words as low-dimensional continuous vectors. In this work, we discover that dictionary learning can decompose these word vectors into linear combinations of more elementary word factors. We demonstrate that many of the learned factors have surprisingly strong semantic or syntactic meaning, corresponding to factors previously identified manually by human inspection. Thus dictionary learning provides a powerful visualization tool for understanding word embedding representations. Furthermore, we show that the word factors can help identify key semantic and syntactic differences in word analogy tasks and improve upon state-of-the-art word embedding techniques in these tasks by a large margin.
Seismic data quality is vital to geophysical applications, so methods of data recovery, including denoising and interpolation, are common initial steps in the seismic data processing flow. We present a method to perform simultaneous interpolation and denoising, which is based on double-sparsity dictionary learning. This extends previous work that was for denoising only. The original double sparsity dictionary learning algorithm is modified to track the traces with missing data by defining a masking operator that is integrated into the sparse representation of the dictionary. A weighted low-rank approximation algorithm is adopted to handle the dictionary updating as a sparse recovery optimization problem constrained by the masking operator. Compared to traditional sparse transforms with fixed dictionaries that lack the ability to adapt to complex data structures, the double-sparsity dictionary learning method learns the signal adaptively from selected patches of the corrupted seismic data while preserving compact forward and inverse transform operators. Numerical experiments on synthetic seismic data indicate that this new method preserves more subtle features in the dataset without introducing pseudo-Gibbs artifacts when compared to other directional multiscale transform methods such as curvelets.