This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors over a polyhedral partition of the feature space. The resulting algorithm, which we call PARC (Piecewise Affine Regression and Classification), alternates between two steps: (i) solving ridge regression problems for numeric targets, softmax regression problems for categorical targets, and either softmax regression or cluster centroid computation for piecewise linear separation; and (ii) assigning the training points to clusters according to a criterion that balances prediction accuracy and piecewise-linear separability. We prove that PARC is a block-coordinate descent algorithm that optimizes a suitably constructed objective function, and that it converges in a finite number of steps to a local minimum of that function. The accuracy of the algorithm is extensively tested numerically on synthetic and real-world datasets, showing that the approach provides an extension of linear regression/classification that is particularly useful when the obtained predictor is used as part of an optimization model. A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/~bemporad/parc.
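The alternation described in the abstract can be illustrated with a heavily simplified sketch. It is not the authors' implementation: it handles a single numeric target only, uses the centroid-based option for separation (nearest centroid, i.e. a Voronoi partition), and starts from an assumed nearest-centroid initialization. The names `fit_ridge`, `parc_sketch`, `predict`, and the parameters `sigma` (weight of the separability term) and `alpha` (ridge penalty) are hypothetical choices made for this illustration.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Ridge regression with an unpenalized intercept; returns (weights, bias)."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = alpha * np.eye(Xa.shape[1])
    reg[-1, -1] = 0.0  # leave the intercept unpenalized
    theta = np.linalg.solve(Xa.T @ Xa + reg, Xa.T @ y)
    return theta[:-1], theta[-1]


def sq_dist(X, C):
    """Squared Euclidean distance from every row of X to every row of C."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)


def parc_sketch(X, y, K=2, sigma=1.0, alpha=1e-3, max_iter=50, seed=0):
    """Simplified alternation (assumption: numeric target, centroid separation)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialization: K random training points act as centroids.
    C = X[rng.choice(n, size=K, replace=False)].copy()
    z = np.argmin(sq_dist(X, C), axis=1)
    W, b = np.zeros((K, d)), np.zeros(K)
    history = []  # objective value after each reassignment step
    for _ in range(max_iter):
        # Step (i): per-cluster ridge fit and centroid update.
        for j in range(K):
            mask = z == j
            if mask.any():  # an empty cluster keeps its previous parameters
                W[j], b[j] = fit_ridge(X[mask], y[mask], alpha)
                C[j] = X[mask].mean(axis=0)
        # Step (ii): reassign each point to the cluster with the lowest
        # combined cost (prediction error + sigma * distance to centroid).
        cost = (y[:, None] - (X @ W.T + b)) ** 2 + sigma * sq_dist(X, C)
        z_new = np.argmin(cost, axis=1)
        history.append(cost[np.arange(n), z_new].sum() + alpha * (W ** 2).sum())
        if np.array_equal(z_new, z):
            break  # assignments are stable, so further sweeps change nothing
        z = z_new
    return W, b, C, z, history


def predict(X, W, b, C):
    """Pick the nearest centroid, then apply that cluster's affine model."""
    j = np.argmin(sq_dist(X, C), axis=1)
    return np.einsum("ij,ij->i", X, W[j]) + b[j]
```

In this sketch each sweep exactly minimizes the same cost over one block of variables (model parameters and centroids, then assignments), so the values recorded in `history` should not increase. This mirrors the block-coordinate descent view stated in the abstract, although the objective used here is only an assumed stand-in for the one constructed in the paper.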
In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data. These include data from medical image processing and robotic interaction. Even though each individual task can
In this paper, we improve the PAC-Bayesian error bound for linear regression derived in Germain et al. [10]. The improvements are twofold. First, the proposed error bound is tighter, and converges to the generalization loss with a well-chosen tempera
Understanding the loss surface of a neural network is fundamentally important to the understanding of deep learning. This paper presents how piecewise linear activation functions substantially shape the loss surfaces of neural networks. We first prov
With the dramatic increase of dimensions in the data representation, extracting latent low-dimensional features becomes of the utmost importance for efficient classification. Aiming at the problems of unclear margin representation and difficulty in r
Given a linear regression setting, Iterative Least Trimmed Squares (ILTS) involves alternating between (a) selecting the subset of samples with lowest current loss, and (b) re-fitting the linear model only on that subset. Both steps are very fast and