Most multilayer least squares (LS)-based neural networks are structured with two separate stages: unsupervised feature encoding and supervised pattern classification. Once the unsupervised learning is finished, the latent encoding is fixed and receives no supervised fine-tuning. However, in complex tasks such as handling the ImageNet dataset, there are often many more clues that can be directly encoded, while unsupervised learning, by definition, cannot know exactly what is useful for a certain task. This motivates retraining the latent space representations to learn clues that unsupervised learning has not yet learned. In particular, the error matrix from the output layer is pulled back to each hidden layer, and the parameters of the hidden layer are recalculated with the Moore-Penrose (MP) inverse for more generalized representations. In this paper, a recomputation-based multilayer network using MP inverse (RML-MP) is developed. A sparse RML-MP (SRML-MP) model is then proposed to boost the performance of RML-MP. The experimental results with varying numbers of training samples (from 3 K to 1.8 M) show that the proposed models provide better generalization performance than most representation learning algorithms.
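As a minimal sketch of the least-squares stage this abstract builds on, the snippet below solves the output weights of a single hidden layer with the Moore-Penrose inverse and forms the output-layer error matrix. It assumes a fixed random hidden layer as a stand-in for the unsupervised encoding; the names `hidden_features` and `fit_output_weights` are hypothetical, and the pull-back and recomputation of hidden-layer parameters in RML-MP are not reproduced here.

```python
import numpy as np


def hidden_features(X, W, b):
    """Hidden-layer activations for fixed weights W and biases b."""
    return np.tanh(X @ W + b)


def fit_output_weights(H, T):
    """Output weights minimizing ||H @ beta - T||_F via the MP inverse."""
    return np.linalg.pinv(H) @ T


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                 # toy inputs
T = np.eye(3)[rng.integers(0, 3, size=200)]    # one-hot toy targets
W = rng.normal(scale=0.5, size=(10, 32))       # assumption: fixed random encoder
b = rng.normal(scale=0.5, size=32)

H = hidden_features(X, W, b)
beta = fit_output_weights(H, T)
E = T - H @ beta  # output-layer error matrix of the least-squares fit
```

Because `beta` is a least-squares solution, the error matrix `E` is orthogonal to the columns of `H`; any further improvement therefore has to come from changing the hidden representation itself, which is the motivation stated in the abstract.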
Trajectory prediction plays a pivotal role in the field of intelligent vehicles. It currently suffers from several challenges, e.g., accumulated error in the rollout process and weak adaptability across scenarios. This paper proposes a parametric-learning recursive least squares (RLS) estimation based on a deep neural network for trajectory prediction. We design a flexible plug-in module that can be readily integrated into rollout approaches. Goal points are proposed to capture the long-term prediction stability from the global perspective. We carried out experiments on the NGSIM dataset. The promising results indicate that our method can effectively improve rollout trajectory prediction methods.
Given a linear regression setting, Iterative Least Trimmed Squares (ILTS) involves alternating between (a) selecting the subset of samples with lowest current loss, and (b) re-fitting the linear model only on that subset. Both steps are very fast and simple. In this paper we analyze ILTS in the setting of mixed linear regression with corruptions (MLR-C). We first establish deterministic conditions (on the features etc.) under which the ILTS iterate converges linearly to the closest mixture component. We also provide a global algorithm that uses ILTS as a subroutine, to fully solve mixed linear regressions with corruptions. We then evaluate it for the widely studied setting of isotropic Gaussian features, and establish that we match or improve on existing results in terms of sample complexity. Finally, we provide an ODE analysis for a gradient-descent variant of ILTS that has optimal time complexity. Our results provide initial theoretical evidence that iteratively fitting to the best subset of samples -- a potentially widely applicable idea -- can provably provide state-of-the-art performance in bad training data settings.
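The two alternating steps of ILTS described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `ilts`, the `keep_fraction` parameter, the fixed iteration count, and the initialization by ordinary least squares on all samples are all assumptions.

```python
import numpy as np


def ilts(X, y, keep_fraction=0.8, n_iters=20):
    """Iterative least trimmed squares for linear regression (sketch)."""
    n = X.shape[0]
    k = int(keep_fraction * n)
    # Assumption: start from the ordinary least-squares fit on all samples.
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iters):
        # (a) select the k samples with the lowest current squared loss
        losses = (y - X @ theta) ** 2
        keep = np.argsort(losses)[:k]
        # (b) re-fit the linear model on that subset only
        theta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    return theta


# Toy data: 10% of the responses are corrupted by a large offset.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true
y[:50] += 50.0

theta_hat = ilts(X, y, keep_fraction=0.8)
```

In this toy setting the corrupted samples have by far the largest residuals after the initial fit, so they are trimmed in the first pass and the re-fit uses clean samples only. The mixed-regression setting analyzed in the paper is harder, since the discarded samples may follow another linear component rather than gross outliers.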
This paper proposes a new estimator for selecting weights to average over least squares estimates obtained from a set of models. Our proposed estimator builds on the Mallows model average (MMA) estimator of Hansen (2007), but, unlike MMA, simultaneously controls for location bias and regression error through a common constant. We show that our proposed estimator, the mean-shift Mallows model average (MSA) estimator, is asymptotically optimal to the original MMA estimator in terms of mean squared error. A simulation study is presented, where we show that our proposed estimator uniformly outperforms the MMA estimator.
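For orientation, the Mallows criterion that the MMA estimator of Hansen (2007) minimizes can be written as below. The notation is an assumption introduced here: $\hat{\mu}_m$ is the least squares fit from model $m$, $k_m$ its number of parameters, and $\hat{\sigma}^2$ an estimate of the error variance. The mean-shift modification that defines MSA is not specified in the abstract and is not reproduced.

```latex
\[
  \hat{w} = \arg\min_{w \in \mathcal{H}} C_n(w),
  \qquad
  C_n(w) = \Bigl\| y - \sum_{m=1}^{M} w_m \hat{\mu}_m \Bigr\|^2
           + 2 \hat{\sigma}^2 \sum_{m=1}^{M} w_m k_m ,
\]
\[
  \mathcal{H} = \Bigl\{ w \in [0,1]^M : \sum_{m=1}^{M} w_m = 1 \Bigr\}.
\]
```

The first term measures the in-sample fit of the averaged estimate, and the second penalizes the effective number of parameters of the weighted combination.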
This paper studies an unsupervised deep learning-based numerical approach for solving partial differential equations (PDEs). The approach uses a deep neural network to approximate solutions of PDEs through the compositional construction and employs least-squares functionals as loss functions to determine the parameters of the network. There are various least-squares functionals for a partial differential equation. This paper focuses on the so-called first-order system least-squares (FOSLS) functional studied in [3], which is based on a first-order system of scalar second-order elliptic PDEs. Numerical results for second-order elliptic PDEs in one dimension are presented.
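To illustrate what a first-order system least-squares functional looks like, the sketch below evaluates one for the one-dimensional model problem $-u'' = f$ on $(0, 1)$, rewritten as the first-order system $\sigma + u' = 0$, $\sigma' = f$. This is a hedged illustration rather than the paper's method: the model problem, the function name `fosls_functional`, and the midpoint quadrature are assumptions, the candidates are closed-form functions instead of neural networks, and boundary terms are omitted because both candidates satisfy $u(0) = u(1) = 0$.

```python
import numpy as np


def fosls_functional(u_prime, sigma, sigma_prime, f, n=1000):
    """Midpoint-rule value of ||sigma + u'||^2 + ||sigma' - f||^2 on (0, 1)."""
    x = (np.arange(n) + 0.5) / n          # midpoints of n equal subintervals
    r1 = sigma(x) + u_prime(x)            # residual of sigma + u' = 0
    r2 = sigma_prime(x) - f(x)            # residual of sigma' = f
    return np.mean(r1 ** 2) + np.mean(r2 ** 2)


def f(x):
    """Right-hand side for which u(x) = sin(pi x) solves -u'' = f."""
    return np.pi ** 2 * np.sin(np.pi * x)


# Exact pair: u = sin(pi x), sigma = -u'.
loss_exact = fosls_functional(
    u_prime=lambda x: np.pi * np.cos(np.pi * x),
    sigma=lambda x: -np.pi * np.cos(np.pi * x),
    sigma_prime=lambda x: np.pi ** 2 * np.sin(np.pi * x),
    f=f,
)

# Wrong candidate: u = x (1 - x), sigma = -u' = 2x - 1, sigma' = 2.
loss_wrong = fosls_functional(
    u_prime=lambda x: 1.0 - 2.0 * x,
    sigma=lambda x: 2.0 * x - 1.0,
    sigma_prime=lambda x: np.full_like(x, 2.0),
    f=f,
)
```

The functional vanishes for the exact pair and is strictly positive for the wrong candidate, which is what makes it usable as a training loss without labeled solution data.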
Recursive least squares (RLS) algorithms were once widely used for training small-scale neural networks, due to their fast convergence. However, previous RLS algorithms are unsuitable for training deep neural networks (DNNs), since they have high computational complexity and too many preconditions. In this paper, to overcome these drawbacks, we propose three novel RLS optimization algorithms for training feedforward neural networks, convolutional neural networks and recurrent neural networks (including long short-term memory networks), by using the error backpropagation and our average-approximation RLS method, together with the equivalent gradients of the linear least squares loss function with respect to the linear outputs of hidden layers. Compared with previous RLS optimization algorithms, our algorithms are simple and elegant. They can be viewed as an improved stochastic gradient descent (SGD) algorithm, which uses the inverse autocorrelation matrix of each layer as the adaptive learning rate. Their time and space complexities are only several times those of SGD. They only require the loss function to be the mean squared error and the activation function of the output layer to be invertible. In fact, our algorithms can also be used in combination with other first-order optimization algorithms without requiring these two preconditions. In addition, we present two improved methods for our algorithms. Finally, we demonstrate their effectiveness compared to the Adam algorithm on the MNIST, CIFAR-10 and IMDB datasets, and investigate the influence of their hyperparameters experimentally.
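The role of the inverse autocorrelation matrix as an adaptive learning rate can be seen in the classical RLS update for a single linear model, sketched below. This is the textbook recursion, not the average-approximation RLS method proposed in the paper; the class name `RecursiveLeastSquares` and the parameters `forgetting` and `delta` are assumptions chosen for illustration.

```python
import numpy as np


class RecursiveLeastSquares:
    """Classical RLS for a scalar-output linear model y ~ w @ x (sketch)."""

    def __init__(self, n_features, forgetting=1.0, delta=1e3):
        self.w = np.zeros(n_features)
        # P tracks the inverse of the (regularized) input autocorrelation matrix.
        self.P = delta * np.eye(n_features)
        self.lam = forgetting

    def update(self, x, y):
        """Process one sample and return the a priori prediction error."""
        Px = self.P @ x
        gain = Px / (self.lam + x @ Px)
        error = y - self.w @ x
        # The error is scaled by a gain derived from P, not by a scalar step size.
        self.w = self.w + gain * error
        self.P = (self.P - np.outer(gain, Px)) / self.lam
        return error


rng = np.random.default_rng(0)
w_true = np.array([0.5, -1.0, 2.0])
rls = RecursiveLeastSquares(n_features=3)
for _ in range(200):
    x = rng.normal(size=3)
    rls.update(x, w_true @ x)
```

Each update costs O(d^2) in the number of inputs d because of the matrix `P`, which gives a sense of why a direct application to every layer of a deep network is expensive.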