ﻻ يوجد ملخص باللغة العربية
Many applications of machine learning involve the analysis of large data frames-matrices collecting heterogeneous measurements (binary, numerical, counts, etc.) across samples-with missing values. Low-rank models, as studied by Udell et al. [30], are popular in this framework for tasks such as visualization, clustering and missing value imputation. Yet, available methods with statistical guarantees and efficient optimization do not allow explicit modeling of main additive effects such as row and column, or covariate effects. In this paper, we introduce a low-rank interaction and sparse additive effects (LORIS) model which combines matrix regression on a dictionary and low-rank design, to estimate main effects and interactions simultaneously. We provide statistical guarantees in the form of upper bounds on the estimation error of both components. Then, we introduce a mixed coordinate gradient descent (MCGD) method which provably converges sub-linearly to an optimal solution and is computationally efficient for large scale data sets. We show on simulated and survey data that the method has a clear advantage over current practices, which consist in dealing separately with additive effects in a preprocessing step.
Many recent advances in machine learning are driven by a challenging trifecta: large data size $N$; high dimensions; and expensive algorithms. In this setting, cross-validation (CV) serves as an important tool for model assessment. Recent advances in
Low rank tensor ring model is powerful for image completion which recovers missing entries in data acquisition and transformation. The recently proposed tensor ring (TR) based completion algorithms generally solve the low rank optimization problem by
Missing value problem in spatiotemporal traffic data has long been a challenging topic, in particular for large-scale and high-dimensional data with complex missing mechanisms and diverse degrees of missingness. Recent studies based on tensor nuclear
Missing values challenge data analysis because many supervised and unsupervised learning methods cannot be applied directly to incomplete data. Matrix completion based on low-rank assumptions are very powerful solution for dealing with missing values
This paper proposes a fast and accurate method for sparse regression in the presence of missing data. The underlying statistical model encapsulates the low-dimensional structure of the incomplete data matrix and the sparsity of the regression coeffic