Semi-supervised deep learning based on label propagation in a 2D embedded space


Abstract in English

While convolutional neural networks need large labeled sets for training images, expert human supervision of such datasets can be very laborious. Proposed solutions propagate labels from a small set of supervised images to a large set of unsupervised ones to obtain sufficient truly-and-artificially labeled samples to train a deep neural network model. Yet, such solutions need many supervised images for validation. We present a loop in which a deep neural network (VGG-16) is trained from a set with more correctly labeled samples along iterations, created by using t-SNE to project the features of its last max-pooling layer into a 2D embedded space in which labels are propagated using the Optimum-Path Forest semi-supervised classifier. As the labeled set improves along iterations, it improves the features of the neural network. We show that this can significantly improve classification results on test data (using only 1% to 5% of supervised samples) of three private challenging datasets and two public ones.

Download