ﻻ يوجد ملخص باللغة العربية
Many classification problems involve data instances that are interlinked with each other, such as webpages connected by hyperlinks. Techniques for collective classification (CC) often increase accuracy for such data graphs, but usually require a fully-labeled training graph. In contrast, we examine how to improve the semi-supervised learning of CC models when given only a sparsely-labeled graph, a common situation. We first describe how to use novel combinations of classifiers to exploit the different characteristics of the relational features vs. the non-relational features. We also extend the ideas of label regularization to such hybrid classifiers, enabling them to leverage the unlabeled data to bias the learning process. We find that these techniques, which are efficient and easy to implement, significantly increase accuracy on three real datasets. In addition, our results explain conflicting findings from prior related studies.
Self-training is a standard approach to semi-supervised learning where the learners own predictions on unlabeled data are used as supervision during training. In this paper, we reinterpret this label assignment process as an optimal transportation pr
Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network by softeni
Label Smoothing (LS) is an effective regularizer to improve the generalization of state-of-the-art deep models. For each training sample the LS strategy smooths the one-hot encoded training signal by distributing its distribution mass over the non gr
Graph neural networks (GNNs) achieve remarkable success in graph-based semi-supervised node classification, leveraging the information from neighboring nodes to improve the representation learning of target node. The success of GNNs at node classific
A common classification task situation is where one has a large amount of data available for training, but only a small portion is annotated with class labels. The goal of semi-supervised training, in this context, is to improve classification accura