This paper develops a new class of nonconvex regularizers for low-rank matrix recovery. Many regularizers are motivated as convex relaxations of the matrix rank function. Our new factor group-sparse regularizers are motivated as a relaxation of the number of nonzero columns in a factorization of the matrix. These nonconvex regularizers are sharper than the nuclear norm; indeed, we show they are related to Schatten-$p$ norms with arbitrarily small $0 < p \leq 1$. Moreover, these factor group-sparse regularizers can be written in a factored form that enables efficient and effective nonconvex optimization; notably, the method does not use singular value decomposition. We provide generalization error bounds for low-rank matrix completion which show improved upper bounds for Schatten-$p$ norm regularization as $p$ decreases. Compared to the max norm and the factored formulation of the nuclear norm, factor group-sparse regularizers are more efficient, accurate, and robust to the initial guess of rank. Experiments show promising performance of factor group-sparse regularization for low-rank matrix completion and robust principal component analysis.
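For context, the factored formulation of the nuclear norm mentioned above rests on a standard identity: $\|X\|_* = \min_{X = AB^T} \frac{1}{2}(\|A\|_F^2 + \|B\|_F^2) = \min_{X = AB^T} \sum_j \|a_j\|_2 \|b_j\|_2$, where $a_j$ and $b_j$ are the columns of $A$ and $B$. The NumPy sketch below checks numerically that the balanced SVD factorization attains both forms, and that the number of nonzero factor columns equals the rank. It is only an illustration under assumed sizes and variable names; it does not implement the factor group-sparse regularizers themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative rank-3 matrix (the sizes are arbitrary assumptions).
X = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))

# Over-parameterized balanced factorization X = A @ B.T with d = 8 columns,
# built from the SVD purely for checking the identity.
d = 8
U, s, Vt = np.linalg.svd(X, full_matrices=False)
A = U[:, :d] * np.sqrt(s[:d])
B = Vt[:d].T * np.sqrt(s[:d])

nuclear_norm = s.sum()

# Frobenius form: 0.5 * (||A||_F^2 + ||B||_F^2)
frobenius_form = 0.5 * (np.linalg.norm(A, "fro") ** 2
                        + np.linalg.norm(B, "fro") ** 2)

# Column form: sum_j ||a_j||_2 * ||b_j||_2
col_norms_A = np.linalg.norm(A, axis=0)
col_norms_B = np.linalg.norm(B, axis=0)
column_form = np.sum(col_norms_A * col_norms_B)

# Columns that are not numerically zero; this count matches rank(X).
nonzero_columns = int(np.sum(col_norms_A > 1e-6 * col_norms_A.max()))

print(nuclear_norm, frobenius_form, column_form, nonzero_columns)
```

Although the initial guess of the rank here is 8, only three columns of each factor are nonzero, which is the quantity a column-sparsity penalty on the factors would act on.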
Low-rank matrix recovery is the focus of many applications, but it is an NP-hard problem. A popular way to deal with this problem is to solve its convex relaxation, the nuclear norm regularized minimization problem (NRM), which includes LASSO as a special case. For LASSO in the vector case, there are results on selecting the regularization parameter, such as screening rules, which improve the efficiency of the algorithms. However, there are no corresponding parameter selection results for NRM in the matrix case. In this paper, we use duality theory to build a novel rule for choosing the regularization parameter of NRM. The rule states that the regularization parameter can be chosen easily from feasible points of NRM and its dual problem when the rank of the desired solution is no more than a given constant. In particular, we apply this idea to NRM with least squares and Huber functions, and establish easily calculated formulas for the regularization parameters. Finally, we report numerical results on some signal shapes, which show that our proposed rule shrinks the interval of the regularization parameter efficiently.
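To illustrate how the regularization parameter controls the rank of the solution, consider the simplest instance of NRM, $\min_X \frac{1}{2}\|X - M\|_F^2 + \lambda \|X\|_*$. Its minimizer is known to be the singular value soft-thresholding of $M$, so the rank of the solution equals the number of singular values of $M$ that exceed $\lambda$; for a diagonal $M$ this reduces to the entrywise soft-thresholding familiar from LASSO. The sketch below is a minimal illustration of that fact only, not the selection rule proposed in the paper; the helper name `svt` and the test matrices are assumptions.

```python
import numpy as np

def svt(M, lam):
    """Singular value soft-thresholding: the minimizer of
    0.5 * ||X - M||_F^2 + lam * ||X||_* over X."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(1)
# Illustrative data: a rank-3 matrix plus small dense noise.
M = rng.standard_normal((12, 3)) @ rng.standard_normal((3, 10))
M += 0.05 * rng.standard_normal(M.shape)
sigma = np.linalg.svd(M, compute_uv=False)

# Any lam strictly between sigma[3] and sigma[2] yields a rank-3 solution.
lam = 0.5 * (sigma[2] + sigma[3])
X_hat = svt(M, lam)
rank_hat = np.linalg.matrix_rank(X_hat)

# Once lam exceeds the largest singular value, the solution is zero.
X_zero = svt(M, 1.01 * sigma[0])

# Diagonal case: SVT coincides with entrywise soft-thresholding.
v = np.array([3.0, -1.5, 0.2])
soft = np.sign(v) * np.maximum(np.abs(v) - 1.0, 0.0)
diag_case = svt(np.diag(v), 1.0)

print(rank_hat, np.abs(X_zero).max(), np.diag(diag_case))
```

In this toy setting, a bound on the desired rank translates directly into an interval for $\lambda$ between consecutive singular values, which gives some intuition for why a rank constraint can narrow the search range.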
Given a sufficiently large amount of labeled data, the non-convex low-rank matrix recovery problem contains no spurious local minima, so a local optimization algorithm is guaranteed to converge to a global minimum starting from any initial guess. However, the actual amount of data needed by this theoretical guarantee is very pessimistic, as it must prevent spurious local minima from existing anywhere, including at adversarial locations. In contrast, prior work based on good initial guesses has more realistic data requirements, because it allows spurious local minima to exist outside of a neighborhood of the solution. In this paper, we quantify the relationship between the quality of the initial guess and the corresponding reduction in data requirements. Using the restricted isometry constant as a surrogate for sample complexity, we compute a sharp threshold on the number of samples needed to prevent each specific point on the optimization landscape from becoming a spurious local minimum. Optimizing the threshold over regions of the landscape, we see that for initial points around the ground truth, a linear improvement in the quality of the initial guess amounts to a constant factor improvement in the sample complexity.
We propose a sparse and low-rank tensor regression model to relate a univariate outcome to a feature tensor, in which each unit-rank tensor from the CP decomposition of the coefficient tensor is assumed to be sparse. This structure is both parsimonious and highly interpretable, as it implies that the outcome is related to the features through a few distinct pathways, each of which may only involve subsets of feature dimensions. We take a divide-and-conquer strategy to simplify the task into a set of sparse unit-rank tensor regression problems. To make the computation efficient and scalable, for the unit-rank tensor regression, we propose a stagewise estimation procedure to efficiently trace out its entire solution path. We show that as the step size goes to zero, the stagewise solution paths converge exactly to those of the corresponding regularized regression. The superior performance of our approach is demonstrated on various real-world and synthetic examples.
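The model structure described above can be made concrete with a small NumPy sketch: a coefficient tensor is assembled from a few sparse unit-rank CP terms, and the contribution of a feature tensor splits into one term per "pathway". The dimensions, the CP rank, the weights, and the helper `sparse_vector` are all illustrative assumptions; the sketch shows only the decomposition, not the stagewise estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
dims, R = (6, 5, 4), 2  # assumed feature-tensor shape and CP rank

def sparse_vector(n, nnz):
    """Random length-n vector with exactly nnz nonzero entries."""
    v = np.zeros(n)
    idx = rng.choice(n, size=nnz, replace=False)
    v[idx] = rng.standard_normal(nnz)
    return v

# Pathway r is the sparse unit-rank tensor w_r * (a_r o b_r o c_r).
weights = np.array([2.0, -1.0])
factors = [[sparse_vector(n, 2) for n in dims] for _ in range(R)]

# Coefficient tensor: the sum of the R unit-rank terms.
coef = sum(w * np.einsum("i,j,k->ijk", a, b, c)
           for w, (a, b, c) in zip(weights, factors))

X = rng.standard_normal(dims)  # one feature tensor

# Linear predictor <X, coef>, computed from the full tensor ...
y_full = np.sum(X * coef)
# ... and pathway by pathway, without forming the full tensor.
y_pathways = [w * np.einsum("ijk,i,j,k->", X, a, b, c)
              for w, (a, b, c) in zip(weights, factors)]

print(y_full, sum(y_pathways), np.count_nonzero(coef))
```

Each pathway touches only the feature indices where all three of its factor vectors are nonzero (at most 2 x 2 x 2 = 8 entries here), which is the sense in which the structure is parsimonious and interpretable.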
Low-rank matrix recovery is a fundamental problem in signal processing and machine learning. A recently popular approach to recovering a low-rank matrix X is to factorize it as a product of two smaller matrices, i.e., X = UV^T, and then optimize over U, V instead of X. Despite the resulting non-convexity, recent results have shown that many factorized objective functions actually have benign global geometry (no spurious local minima, and satisfaction of the so-called strict saddle property), which ensures convergence to a global minimum for many local-search algorithms. Such results hold whenever the original objective function is restricted strongly convex and smooth. However, most of these results actually consider a modified cost function that includes a balancing regularizer. While useful for deriving theory, this balancing regularizer does not appear to be necessary in practice. In this work, we close this theory-practice gap by proving that the unaltered factorized non-convex problem, without the balancing regularizer, also has similar benign global geometry. Moreover, we extend our theoretical results to the field of distributed optimization.
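As a minimal sketch of the setting described above, the following NumPy code runs plain gradient descent on the unaltered factorized objective $\frac{1}{2}\|UV^T - M\|_F^2$, with no balancing regularizer, from a small random initialization. The fully observed least-squares objective, the problem sizes, the step size, and the iteration count are assumptions chosen for illustration; this is not the analysis or the distributed algorithm of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, r = 20, 15, 2  # assumed sizes and rank

# Rank-r target, scaled to unit spectral norm so a fixed step size is safe.
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
M /= np.linalg.norm(M, 2)

def loss(U, V):
    """Factorized objective 0.5 * ||U V^T - M||_F^2 (no balancing term)."""
    return 0.5 * np.linalg.norm(U @ V.T - M, "fro") ** 2

# Small random initialization of both factors.
U = 0.1 * rng.standard_normal((n, r))
V = 0.1 * rng.standard_normal((m, r))
initial_loss = loss(U, V)

step = 0.1
for _ in range(2000):
    residual = U @ V.T - M
    # Simultaneous update: both gradients use the current (U, V).
    U, V = U - step * (residual @ V), V - step * (residual.T @ U)

final_loss = loss(U, V)
print(initial_loss, final_loss)
```

Adding a balancing term such as $\frac{\mu}{4}\|U^T U - V^T V\|_F^2$ would change only the two gradient expressions, which makes it easy to compare both variants empirically.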
The nuclear norm and Schatten-$p$ quasi-norm of a matrix are popular rank proxies in low-rank matrix recovery. Unfortunately, computing the nuclear norm or Schatten-$p$ quasi-norm of a tensor is NP-hard, which is an obstacle for low-rank tensor completion (LRTC) and tensor robust principal component analysis (TRPCA). In this paper, we propose a new class of rank regularizers based on the Euclidean norms of the CP component vectors of a tensor and show that these regularizers are monotonic transformations of the tensor Schatten-$p$ quasi-norm. This connection enables us to minimize the Schatten-$p$ quasi-norm in LRTC and TRPCA implicitly. The methods do not use the singular value decomposition and hence scale to big tensors. Moreover, the methods are not sensitive to the choice of initial rank and provide an arbitrarily sharper rank proxy for low-rank tensor recovery than the nuclear norm. We provide theoretical guarantees in terms of recovery error for LRTC and TRPCA, which show that a smaller $p$ in the Schatten-$p$ quasi-norm leads to tighter error bounds. Experiments using LRTC and TRPCA on synthetic data and natural images verify the effectiveness and superiority of our methods compared to baseline methods.