No Arabic abstract
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data. While the mainstream technique seeks to completely filter out the OOD samples for semi-supervised learning (SSL), we propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning while avoiding its adverse impact on the SSL. We achieve this goal by first introducing a warm-up training that leverages all the unlabeled data, including both the in-distribution (ID) and OOD samples. Specifically, we perform a pretext task that enforces our feature extractor to obtain a high-level semantic understanding of the training images, leading to more discriminative features that can benefit the downstream tasks. Since the OOD samples are inevitably detrimental to SSL, we propose a novel cross-modal matching strategy to detect OOD samples. Instead of directly applying binary classification, we train the network to predict whether the data sample is matched to an assigned one-hot class label. The appeal of the proposed cross-modal matching over binary classification is the ability to generate a compatible feature space that aligns with the core classification task. Extensive experiments show that our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
Modern semi-supervised learning methods conventionally assume both labeled and unlabeled data have the same class distribution. However, unlabeled data may include out-of-class samples in practice; those that cannot have one-hot encoded labels from a closed-set of classes in label data, i.e., unlabeled data is an open-set. In this paper, we introduce OpenCoS, a method for handling this realistic semi-supervised learning scenario based on a recent framework of contrastive learning. One of our key findings is that out-of-class samples in the unlabeled dataset can be identified effectively via (unsupervised) contrastive learning. OpenCoS utilizes this information to overcome the failure modes in the existing state-of-the-art semi-supervised methods, e.g., ReMixMatch or FixMatch. It further improves the semi-supervised performance by utilizing soft- and pseudo-labels on open-set unlabeled data, learned from contrastive learning. Our extensive experimental results show the effectiveness of OpenCoS, fixing the state-of-the-art semi-supervised methods to be suitable for diverse scenarios involving open-set unlabeled data.
Semi-supervised learning (SSL) is an effective means to leverage unlabeled data to improve a models performance. Typical SSL methods like FixMatch assume that labeled and unlabeled data share the same label space. However, in practice, unlabeled data can contain categories unseen in the labeled set, i.e., outliers, which can significantly harm the performance of SSL algorithms. To address this problem, we propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch. Learning representations of inliers while rejecting outliers is essential for the success of OSSL. To this end, OpenMatch unifies FixMatch with novelty detection based on one-vs-all (OVA) classifiers. The OVA-classifier outputs the confidence score of a sample being an inlier, providing a threshold to detect outliers. Another key contribution is an open-set soft-consistency regularization loss, which enhances the smoothness of the OVA-classifier with respect to input transformations and greatly improves outlier detection. OpenMatch achieves state-of-the-art performance on three datasets, and even outperforms a fully supervised model in detecting outliers unseen in unlabeled data on CIFAR10.
Pseudo-labeling (PL) and Data Augmentation-based Consistency Training (DACT) are two approaches widely used in Semi-Supervised Learning (SSL) methods. These methods exhibit great power in many machine learning tasks by utilizing unlabeled data for efficient training. But in a more realistic setting (termed as open-set SSL), where unlabeled dataset contains out-of-distribution (OOD) samples, the traditional SSL methods suffer severe performance degradation. Recent approaches mitigate the negative influence of OOD samples by filtering them out from the unlabeled data. However, it is not clear whether directly removing the OOD samples is the best choice. Furthermore, why PL and DACT could perform differently in open-set SSL remains a mystery. In this paper, we thoroughly analyze various SSL methods (PL and DACT) on open-set SSL and discuss pros and cons of these two approaches separately. Based on our analysis, we propose Style Disturbance to improve traditional SSL methods on open-set SSL and experimentally show our approach can achieve state-of-the-art results on various datasets by utilizing OOD samples properly. We believe our study can bring new insights for SSL research.
Lesion detection serves a critical role in early diagnosis and has been well explored in recent years due to methodological advancesand increased data availability. However, the high costs of annotations hinder the collection of large and completely labeled datasets, motivating semi-supervised detection approaches. In this paper, we introduce mean teacher hetero-modal detection (MTHD), which addresses two important gaps in current semi-supervised detection. First, it is not obvious how to enforce unlabeled consistency constraints across the very different outputs of various detectors, which has resulted in various compromises being used in the state of the art. Using an anchor-free framework, MTHD formulates a mean teacher approach without such compromises, enforcing consistency on the soft-output of object centers and size. Second, multi-sequence data is often critical, e.g., for abdominal lesion detection, but unlabeled data is often missing sequences. To deal with this, MTHD incorporates hetero-modal learning in its framework. Unlike prior art, MTHD is able to incorporate an expansive set of consistency constraints that include geometric transforms and random sequence combinations. We train and evaluate MTHD on liver lesion detection using the largest MR lesion dataset to date (1099 patients with >5000 volumes). MTHD surpasses the best fully-supervised and semi-supervised competitors by 10.1% and 3.5%, respectively, in average sensitivity.
Cross-modal hashing aims to map heterogeneous multimedia data into a common Hamming space, which can realize fast and flexible retrieval across different modalities. Supervised cross-modal hashing methods have achieved considerable progress by incorporating semantic side information. However, they mainly have two limitations: (1) Heavily rely on large-scale labeled cross-modal training data which are labor intensive and hard to obtain. (2) Ignore the rich information contained in the large amount of unlabeled data across different modalities, especially the margin examples that are easily to be incorrectly retrieved, which can help to model the correlations. To address these problems, in this paper we propose a novel Semi-supervised Cross-Modal Hashing approach by Generative Adversarial Network (SCH-GAN). We aim to take advantage of GANs ability for modeling data distributions to promote cross-modal hashing learning in an adversarial way. The main contributions can be summarized as follows: (1) We propose a novel generative adversarial network for cross-modal hashing. In our proposed SCH-GAN, the generative model tries to select margin examples of one modality from unlabeled data when giving a query of another modality. While the discriminative model tries to distinguish the selected examples and true positive examples of the query. These two models play a minimax game so that the generative model can promote the hashing performance of discriminative model. (2) We propose a reinforcement learning based algorithm to drive the training of proposed SCH-GAN. The generative model takes the correlation score predicted by discriminative model as a reward, and tries to select the examples close to the margin to promote discriminative model by maximizing the margin between positive and negative data. Experiments on 3 widely-used datasets verify the effectiveness of our proposed approach.