ﻻ يوجد ملخص باللغة العربية
While large self-supervised models have rivalled the performance of their supervised counterparts, small models still struggle. In this report, we explore simple baselines for improving small self-supervised models via distillation, called SimDis. Specifically, we present an offline-distillation baseline, which establishes a new state-of-the-art, and an online-distillation baseline, which achieves similar performance with minimal computational overhead. We hope these baselines will provide useful experience for relevant future research. Code is available at: https://github.com/JindongGu/SimDis/
This paper is concerned with self-supervised learning for small models. The problem is motivated by our empirical studies that while the widely used contrastive self-supervised learning method has shown great progress on large model training, it does
It is a consensus that small models perform quite poorly under the paradigm of self-supervised contrastive learning. Existing methods usually adopt a large off-the-shelf model to transfer knowledge to the small one via knowledge distillation. Despite
Knowledge distillation often involves how to define and transfer knowledge from teacher to student effectively. Although recent self-supervised contrastive knowledge achieves the best performance, forcing the network to learn such knowledge may damag
Weakly Supervised Object Detection (WSOD) has emerged as an effective tool to train object detectors using only the image-level category labels. However, without object-level labels, WSOD detectors are prone to detect bounding boxes on salient object
We present MoDist as a novel method to explicitly distill motion information into self-supervised video representations. Compared to previous video representation learning methods that mostly focus on learning motion cues implicitly from RGB inputs,