Geometry of Factored Nuclear Norm Regularization

Published by Qiuwei Li

Publication date: 2017

Research field: Informatics Engineering

Paper language: English

Authors: Qiuwei Li, Zhihui Zhu, Gongguo Tang

Numerical analysis, Information theory, Machine learning

This work investigates the geometry of a nonconvex reformulation of minimizing a general convex loss function $f(X)$ regularized by the matrix nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at the heart of many applications in machine learning, signal processing, and control. The statistical performance of nuclear norm regularization has been studied extensively in the literature using convex analysis techniques. Despite its optimal performance, the resulting optimization has high computational complexity when solved using standard or even tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer-Monteiro to factor the matrix variable $X$ into the product of two smaller rectangular matrices $X=UV^T$ and also replace the nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation allows many local search algorithms to converge to the global optimum with random initializations.
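The factored reformulation described in the abstract can be tried on a toy problem. The following is a minimal sketch, not the authors' code: it assumes the simplest convex loss $f(X)=\frac{1}{2}\|X-Y\|_F^2$ (an illustrative choice, for which the convex problem has a closed-form solution by singular value soft-thresholding), and the function names, step size, iteration count, and test matrix are all hypothetical. It runs plain gradient descent on $f(UV^T)+\lambda(\|U\|_F^2+\|V\|_F^2)/2$ from a small random initialization and compares $UV^T$ with the convex solution $X^\star$.

```python
import numpy as np


def soft_threshold_svd(Y, lam):
    """Closed-form minimizer of 0.5*||X - Y||_F^2 + lam*||X||_*.

    Obtained by shrinking each singular value of Y by lam (floored at 0).
    """
    P, s, Qt = np.linalg.svd(Y, full_matrices=False)
    return (P * np.maximum(s - lam, 0.0)) @ Qt


def factored_gd(Y, r, lam, step=0.01, iters=5000, seed=0):
    """Gradient descent on the factored objective
    0.5*||U V^T - Y||_F^2 + (lam/2)*(||U||_F^2 + ||V||_F^2).

    step, iters, and the 0.1 initialization scale are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, r))
    V = 0.1 * rng.standard_normal((m, r))
    for _ in range(iters):
        R = U @ V.T - Y                 # residual, i.e. gradient of f at U V^T
        grad_U = R @ V + lam * U
        grad_V = R.T @ U + lam * V
        U, V = U - step * grad_U, V - step * grad_V
    return U, V


def run_demo():
    # Hypothetical test matrix: rank 2 with singular values 5 and 3, plus small noise.
    rng = np.random.default_rng(1)
    P, _ = np.linalg.qr(rng.standard_normal((8, 2)))
    Q, _ = np.linalg.qr(rng.standard_normal((6, 2)))
    Y = P @ np.diag([5.0, 3.0]) @ Q.T + 0.01 * rng.standard_normal((8, 6))
    lam = 1.0
    X_star = soft_threshold_svd(Y, lam)   # solution of the convex problem
    U, V = factored_gd(Y, r=3, lam=lam)   # r chosen >= rank(X_star)
    return U, V, X_star


if __name__ == "__main__":
    U, V, X_star = run_demo()
    print("distance to convex solution:", np.linalg.norm(U @ V.T - X_star))
    print("factored regularizer:", 0.5 * (np.sum(U**2) + np.sum(V**2)))
    print("nuclear norm of X_star:", np.linalg.norm(X_star, "nuc"))
```

In this toy setting the point $U=V=0$ is a critical point of the factored objective, and it is a strict saddle whenever the largest singular value of $Y$ exceeds $\lambda$, which is why a small random initialization is enough for gradient descent to move away from it.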

David F. Gleich, Lek-Heng Lim 2011

The process of rank aggregation is intimately intertwined with the structure of skew-symmetric matrices. We apply recent advances in the theory and algorithms of matrix completion to skew-symmetric matrices. This combination of ideas produces a new method for ranking a set of items. The essence of our idea is that a rank aggregation describes a partially filled skew-symmetric matrix. We extend an algorithm for matrix completion to handle skew-symmetric data and use that to extract ranks for each item. Our algorithm applies to both pairwise comparison and rating data. Because it is based on matrix completion, it is robust to both noise and incomplete data. We show a formal recovery result for the noiseless case and present a detailed study of the algorithm on synthetic data and Netflix ratings.

Numerical analysis

On Dropout and Nuclear Norm Regularization

Poorya Mianjy, Raman Arora 2019

We give a formal and complete characterization of the explicit regularizer induced by dropout in deep linear networks with squared loss. We show that (a) the explicit regularizer is composed of an $\ell_2$-path regularizer and other terms that are also re-scaling invariant, (b) the convex envelope of the induced regularizer is the squared nuclear norm of the network map, and (c) for a sufficiently large dropout rate, we characterize the global optima of the dropout objective. We validate our theoretical findings with empirical results.

Machine learning, Artificial intelligence

Multi-dimensional imaging data recovery via minimizing the partial sum of tubal nuclear norm

Tai-Xiang Jiang, Ting-Zhu Huang, Xi-Le Zhao 2017

In this paper, we investigate tensor recovery problems within the tensor singular value decomposition (t-SVD) framework. We propose the partial sum of the tubal nuclear norm (PSTNN) of a tensor. The PSTNN is a surrogate of the tensor tubal multi-rank. We build two PSTNN-based minimization models for two typical tensor recovery problems, i.e., tensor completion and tensor principal component analysis. We give two algorithms based on the alternating direction method of multipliers (ADMM) to solve the proposed PSTNN-based tensor recovery models. Experimental results on synthetic and real-world data reveal the superiority of the proposed PSTNN.

Numerical analysis, Computer vision and pattern recognition

Efficient and Practical Stochastic Subgradient Descent for Nuclear Norm Regularization

Haim Avron 2012

We describe novel subgradient methods for a broad class of matrix optimization problems involving nuclear norm regularization. Unlike existing approaches, our method executes very cheap iterations by combining low-rank stochastic subgradients with efficient incremental SVD updates, made possible by highly optimized and parallelizable dense linear algebra operations on small matrices. Our practical algorithms always maintain a low-rank factorization of iterates that can be conveniently held in memory and efficiently multiplied to generate predictions in matrix completion settings. Empirical comparisons confirm that our approach is highly competitive with several recently proposed state-of-the-art solvers for such problems.

Machine learning

Mixed-norm Regularization for Brain Decoding

Remi Flamary 2014

This work investigates the use of mixed-norm regularization for sensor selection in Event-Related Potential (ERP) based Brain-Computer Interfaces (BCI). The classification problem is cast as a discriminative optimization framework where sensor selection is induced through the use of mixed norms. This framework is extended to the multi-task learning situation where several similar classification tasks related to different subjects are learned simultaneously. In this case, multi-task learning helps mitigate the data scarcity issue, yielding more robust classifiers. For this purpose, we have introduced a regularizer that induces both sensor selection and classifier similarities. The different regularization approaches are compared on three ERP datasets, showing the value of mixed-norm regularization in terms of sensor selection. The multi-task approaches are evaluated when only a small number of learning examples are available, yielding significant performance improvements, especially for subjects performing poorly.

Machine learning