An algebraic-geometric approach for linear regression without correspondences

352 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Manolis Tsakiris

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Manolis C. Tsakiris - Liangzu Peng - Aldo Conca

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Linear regression without correspondences is the problem of performing a linear regression fit to a dataset for which the correspondences between the independent samples and the observations are unknown. Such a problem naturally arises in diverse domains such as computer vision, data mining, communications and biology. In its simplest form, it is tantamount to solving a linear system of equations, for which the entries of the right hand side vector have been permuted. This type of data corruption renders the linear regression task considerably harder, even in the absence of other corruptions, such as noise, outliers or missing entries. Existing methods are either applicable only to noiseless data or they are very sensitive to initialization or they work only for partially shuffled data. In this paper we address these issues via an algebraic geometric approach, which uses symmetric polynomials to extract permutation-invariant constraints that the parameters $xi^* in Re^n$ of the linear regression model must satisfy. This naturally leads to a polynomial system of $n$ equations in $n$ unknowns, which contains $xi^*$ in its root locus. Using the machinery of algebraic geometry we prove that as long as the independent samples are generic, this polynomial system is always consistent with at most $n!$ complex roots, regardless of any type of corruption inflicted on the observations. The algorithmic implication of this fact is that one can always solve this polynomial system and use its most suitable root as initialization to the Expectation Maximization algorithm. To the best of our knowledge, the resulting method is the first working solution for small values of $n$ able to handle thousands of fully shuffled noisy observations in milliseconds.

قيم البحث

64 - Liangzu Peng , Manolis C. Tsakiris 2020

Linear regression without correspondences concerns the recovery of a signal in the linear regression setting, where the correspondences between the observations and the linear functionals are unknown. The associated maximum likelihood function is NP- hard to compute when the signal has dimension larger than one. To optimize this objective function we reformulate it as a concave minimization problem, which we solve via branch-and-bound. This is supported by a computable search space to branch, an effective lower bounding scheme via convex envelope minimization and a refined upper bound, all naturally arising from the concave minimization reformulation. The resulting algorithm outperforms state-of-the-art methods for fully shuffled data and remains tractable for up to $8$-dimensional signals, an untouched regime in prior work.

نظرية المعلومات التعلم الآلي نظرية المعلومات

A Hypergradient Approach to Robust Regression without Correspondence

75 - Yujia Xie , Yixiu Mao , Simiao Zuo 2020

We consider a variant of regression problem, where the correspondence between input and output data is not available. Such shuffled data is commonly observed in many real world problems. Taking flow cytometry as an example, the measuring instruments may not be able to maintain the correspondence between the samples and the measurements. Due to the combinatorial nature of the problem, most existing methods are only applicable when the sample size is small, and limited to linear regression models. To overcome such bottlenecks, we propose a new computational framework -- ROBOT -- for the shuffled regression problem, which is applicable to large data and complex nonlinear models. Specifically, we reformulate the regression without correspondence as a continuous optimization problem. Then by exploiting the interaction between the regression model and the data correspondence, we develop a hypergradient approach based on differentiable programming techniques. Such a hypergradient approach essentially views the data correspondence as an operator of the regression, and therefore allows us to find a better descent direction for the model parameter by differentiating through the data correspondence. ROBOT can be further extended to the inexact correspondence setting, where there may not be an exact alignment between the input and output data. Thorough numerical experiments show that ROBOT achieves better performance than existing methods in both linear and nonlinear regression tasks, including real-world applications such as flow cytometry and multi-object tracking.

التعلم الآلي التعلم الالي

ADMM-SOFTMAX : An ADMM Approach for Multinomial Logistic Regression

169 - Samy Wu Fung , Sanna Tyrvainen , Lars Ruthotto 2019

We present ADMM-Softmax, an alternating direction method of multipliers (ADMM) for solving multinomial logistic regression (MLR) problems. Our method is geared toward supervised classification tasks with many examples and features. It decouples the n onlinear optimization problem in MLR into three steps that can be solved efficiently. In particular, each iteration of ADMM-Softmax consists of a linear least-squares problem, a set of independent small-scale smooth, convex problems, and a trivial dual variable update. Solution of the least-squares problem can be be accelerated by pre-computing a factorization or preconditioner, and the separability in the smooth, convex problem can be easily parallelized across examples. For two image classification problems, we demonstrate that ADMM-Softmax leads to improved generalization compared to a Newton-Krylov, a quasi Newton, and a stochastic gradient descent method.

التعلم الآلي التعلم الالي

Meta-learning for mixed linear regression

432 - Weihao Kong , Raghav Somani , Zhao Song 2020

In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data. These include data from medical image processing and robotic interaction. Even though each individual task can not be meaningfully trained in isolation, one seeks to meta-learn across the tasks from past experiences by exploiting some similarities. We study a fundamental question of interest: When can abundant tasks with small data compensate for lack of tasks with big data? We focus on a canonical scenario where each task is drawn from a mixture of $k$ linear regressions, and identify sufficient conditions for such a graceful exchange to hold; The total number of examples necessary with only small data tasks scales similarly as when big data tasks are available. To this end, we introduce a novel spectral approach and show that we can efficiently utilize small data tasks with the help of $tildeOmega(k^{3/2})$ medium data tasks each with $tildeOmega(k^{1/2})$ examples.

التعلم الآلي التعلم الالي

Improved PAC-Bayesian Bounds for Linear Regression

192 - Vera Shalaeva 2019

In this paper, we improve the PAC-Bayesian error bound for linear regression derived in Germain et al. [10]. The improvements are twofold. First, the proposed error bound is tighter, and converges to the generalization loss with a well-chosen tempera ture parameter. Second, the error bound also holds for training data that are not independently sampled. In particular, the error bound applies to certain time series generated by well-known classes of dynamical models, such as ARX models.

التعلم الآلي التعلم الالي