Twin Neural Network Regression is a Semi-Supervised Regression Algorithm

79 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Sebastian Johann Wetzel

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Sebastian J. Wetzel - Roger G. Melko - Isaac Tamblyn

التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Twin neural network regression (TNNR) is a semi-supervised regression algorithm, it can be trained on unlabelled data points as long as other, labelled anchor data points, are present. TNNR is trained to predict differences between the target values of two different data points rather than the targets themselves. By ensembling predicted differences between the targets of an unseen data point and all training data points, it is possible to obtain a very accurate prediction for the original regression problem. Since any loop of predicted differences should sum to zero, loops can be supplied to the training data, even if the data points themselves within loops are unlabelled. Semi-supervised training improves TNNR performance, which is already state of the art, significantly.

قيم البحث

اقرأ أيضاً

Semi-supervised Active Regression

422 - Fnu Devvrit , Nived Rajaraman , Pranjal Awasthi 2021

Labelled data often comes at a high cost as it may require recruiting human labelers or running costly experiments. At the same time, in many practical scenarios, one already has access to a partially labelled, potentially biased dataset that can hel p with the learning task at hand. Motivated by such settings, we formally initiate a study of $semi-supervised$ $active$ $learning$ through the frame of linear regression. In this setting, the learner has access to a dataset $X in mathbb{R}^{(n_1+n_2) times d}$ which is composed of $n_1$ unlabelled examples that an algorithm can actively query, and $n_2$ examples labelled a-priori. Concretely, denoting the true labels by $Y in mathbb{R}^{n_1 + n_2}$, the learners objective is to find $widehat{beta} in mathbb{R}^d$ such that, begin{equation} | X widehat{beta} - Y |_2^2 le (1 + epsilon) min_{beta in mathbb{R}^d} | X beta - Y |_2^2 end{equation} while making as few additional label queries as possible. In order to bound the label queries, we introduce an instance dependent parameter called the reduced rank, denoted by $R_X$, and propose an efficient algorithm with query complexity $O(R_X/epsilon)$. This result directly implies improved upper bounds for two important special cases: (i) active ridge regression, and (ii) active kernel ridge regression, where the reduced-rank equates to the statistical dimension, $sd_lambda$ and effective dimension, $d_lambda$ of the problem respectively, where $lambda ge 0$ denotes the regularization parameter. For active ridge regression we also prove a matching lower bound of $O(sd_lambda / epsilon)$ on the query complexity of any algorithm. This subsumes prior work that only considered the unregularized case, i.e., $lambda = 0$.

التعلم الآلي

Semi-Supervised Ordinal Regression Based on Empirical Risk Minimization

192 - Taira Tsuchiya , Nontawat Charoenphakdee , Issei Sato 2019

Ordinal regression is aimed at predicting an ordinal class label. In this paper, we consider its semi-supervised formulation, in which we have unlabeled data along with ordinal-labeled data to train an ordinal regressor. There are several metrics to evaluate the performance of ordinal regression, such as the mean absolute error, mean zero-one error, and mean squared error. However, the existing studies do not take the evaluation metric into account, have a restriction on the model choice, and have no theoretical guarantee. To overcome these problems, we propose a novel generic framework for semi-supervised ordinal regression based on the empirical risk minimization principle that is applicable to optimizing all of the metrics mentioned above. Besides, our framework has flexible choices of models, surrogate losses, and optimization algorithms without the common geometric assumption on unlabeled data such as the cluster assumption or manifold assumption. We further provide an estimation error bound to show that our risk estimator is consistent. Finally, we conduct experiments to show the usefulness of our framework.

التعلم الآلي التعلم الالي

A Neural Network for Semi-Supervised Learning on Manifolds

61 - Alexander Genkin , Anirvan M. Sengupta , Dmitri Chklovskii 2019

Semi-supervised learning algorithms typically construct a weighted graph of data points to represent a manifold. However, an explicit graph representation is problematic for neural networks operating in the online setting. Here, we propose a feed-for ward neural network capable of semi-supervised learning on manifolds without using an explicit graph representation. Our algorithm uses channels that represent localities on the manifold such that correlations between channels represent manifold structure. The proposed neural network has two layers. The first layer learns to build a representation of low-dimensional manifolds in the input data as proposed recently in [8]. The second learns to classify data using both occasional supervision and similarity of the manifold representation of the data. The channel carrying label information for the second layer is assumed to be silent most of the time. Learning in both layers is Hebbian, making our network design biologically plausible. We experimentally demonstrate the effect of semi-supervised learning on non-trivial manifolds.

التعلم الآلي التعلم الالي

Semi-supervised structured output prediction by local linear regression and sub-gradient descent

117 - Ru-Ze Liang , Wei Xie , Weizhi Li 2016

We propose a novel semi-supervised structured output prediction method based on local linear regression in this paper. The existing semi-supervise structured output prediction methods learn a global predictor for all the data points in a data set, wh ich ignores the differences of local distributions of the data set, and the effects to the structured output prediction. To solve this problem, we propose to learn the missing structured outputs and local predictors for neighborhoods of different data points jointly. Using the local linear regression strategy, in the neighborhood of each data point, we propose to learn a local linear predictor by minimizing both the complexity of the predictor and the upper bound of the structured prediction loss. The minimization problem is solved by sub-gradient descent algorithms. We conduct experiments over two benchmark data sets, and the results show the advantages of the proposed method.

التعلم الآلي

Neural Symbolic Regression that Scales

290 - Luca Biggio , Tommaso Bendinelli , Alexander Neitz 2021

Symbolic equations are at the core of scientific discovery. The task of discovering the underlying equation from a set of input-output pairs is called symbolic regression. Traditionally, symbolic regression methods use hand-designed strategies that d o not improve with experience. In this paper, we introduce the first symbolic regression method that leverages large scale pre-training. We procedurally generate an unbounded set of equations, and simultaneously pre-train a Transformer to predict the symbolic equation from a corresponding set of input-output-pairs. At test time, we query the model on a new set of points and use its output to guide the search for the equation. We show empirically that this approach can re-discover a set of well-known physical equations, and that it improves over time with more data and compute.

التعلم الآلي