As a subset of unsupervised representation learning, self-supervised representation learning adopts self-defined signals as supervision and uses the learned representations for downstream tasks, such as object detection and image captioning. Many proposed approaches for self-supervised learning naturally follow a multi-view perspective, where the input (e.g., original images) and the self-supervised signals (e.g., augmented images) can be seen as two redundant views of the data. Building on this multi-view perspective, this paper provides an information-theoretic framework to better understand the properties that encourage successful self-supervised learning. Specifically, we demonstrate that self-supervised learned representations can extract task-relevant information and discard task-irrelevant information. Our theoretical framework paves the way to a larger space of self-supervised learning objective designs. In particular, we propose a composite objective that bridges the gap between prior contrastive and predictive learning objectives, and we introduce an additional objective term to discard task-irrelevant information. To verify our analysis, we conduct controlled experiments to evaluate the impact of the composite objective. We also explore our framework's empirical generalization beyond the multi-view perspective, where the cross-view redundancy may not be clearly observed.
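To make the composite objective concrete, a minimal sketch is given below. It assumes the input view X with representation Z_X = f(X), the self-supervised signal view S with representation Z_S = f(S), an InfoNCE-style estimator as the contrastive term, a variational decoder q for the predictive term, and trade-off weights \lambda_1, \lambda_2; the specific estimators, decoder, and weighting are illustrative assumptions rather than the exact formulation used in the paper:

\[
\mathcal{L} \;=\; \underbrace{-\,\hat{I}_{\mathrm{NCE}}(Z_X;\, Z_S)}_{\text{contrastive}}
\;+\; \lambda_1\, \underbrace{\mathbb{E}\big[-\log q(S \mid Z_X)\big]}_{\text{predictive}}
\;+\; \lambda_2\, \underbrace{\mathbb{E}\big[\|Z_X - Z_S\|_2^2\big]}_{\text{discard task-irrelevant information}}
\]

Under this sketch, minimizing the first two terms encourages Z_X to retain the information shared with the self-supervised signal, which is task-relevant under the cross-view redundancy assumption, while the last term penalizes information in Z_X that cannot be matched from the other view, i.e., task-irrelevant information.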