Self-supervised learning (SSL) is rapidly closing the gap with supervised methods on large computer vision benchmarks. A successful approach to SSL is to learn embeddings which are invariant to distortions of the input sample. However, a recurring issue with this approach is the existence of trivial constant solutions. Most current methods avoid such solutions by careful implementation details. We propose an objective function that naturally avoids collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample, and making it as close to the identity matrix as possible.
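For concreteness, a minimal PyTorch sketch of such a cross-correlation objective follows; the batch size, embedding width, and the `lambd` trade-off weight are illustrative assumptions, not values taken from the paper:

```python
import torch

def barlow_twins_loss(z1, z2, lambd=5e-3):
    """Cross-correlation objective: push the correlation matrix of the
    two branches' embeddings toward the identity matrix."""
    n, d = z1.shape
    # Standardize each embedding dimension across the batch.
    z1 = (z1 - z1.mean(0)) / z1.std(0)
    z2 = (z2 - z2.mean(0)) / z2.std(0)
    c = (z1.T @ z2) / n  # (d, d) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()  # invariance term
    off_diag = (c - torch.diag_embed(torch.diagonal(c))).pow(2).sum()  # redundancy term
    return on_diag + lambd * off_diag

# Embeddings of two distorted views of the same batch (random stand-ins here).
z1, z2 = torch.randn(128, 64), torch.randn(128, 64)
print(barlow_twins_loss(z1, z2))
```

Driving the diagonal toward 1 makes the two views' embeddings agree, while driving the off-diagonal toward 0 decorrelates the embedding dimensions, which is what rules out the trivial constant solution.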
Self-supervised learning (especially contrastive learning) has attracted great interest due to its tremendous potential in learning discriminative representations in an unsupervised manner. Despite the acknowledged successes, existing contrastive learning…
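As a rough illustration of what a contrastive objective looks like, here is a minimal InfoNCE-style sketch; the in-batch negatives and the `temperature` value are generic assumptions, not details of the paper above:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrast matching rows of z1/z2 (two views of the same images)
    against all other rows in the batch, which act as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature    # (n, n) scaled cosine similarities
    targets = torch.arange(z1.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
print(info_nce_loss(z1, z2))
```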
This paper investigates two techniques for developing efficient self-supervised vision transformers (EsViT) for visual representation learning. First, we show through a comprehensive empirical study that multi-stage architectures with sparse self-attentions can significantly reduce modeling complexity, but with a cost of losing the ability to capture fine-grained correspondences between image regions.
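As a loose illustration of sparse self-attention in such multi-stage designs, the sketch below restricts attention to fixed-size windows; the window size, head count, and embedding width are arbitrary assumptions rather than EsViT's actual configuration:

```python
import torch
import torch.nn as nn

def window_attention(x, attn, window=4):
    """Restrict self-attention to fixed-size windows: cost drops from
    O(L^2) to O(L * window) at the price of no cross-window interaction."""
    b, l, d = x.shape
    assert l % window == 0, "sequence length must be divisible by the window"
    xw = x.reshape(b * (l // window), window, d)  # fold windows into the batch
    out, _ = attn(xw, xw, xw)                     # full attention inside each window
    return out.reshape(b, l, d)

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
tokens = torch.randn(2, 16, 64)              # (batch, tokens, dim)
print(window_attention(tokens, attn).shape)  # torch.Size([2, 16, 64])
```

The lost cross-window interaction is one concrete form of the "fine-grained correspondence" cost the abstract refers to.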
Point clouds provide a compact and efficient representation of 3D shapes. While deep neural networks have achieved impressive results on point cloud learning tasks, they require massive amounts of manually labeled data, which can be costly and time-consuming to obtain.
Self-training is an effective approach to semi-supervised learning. The key idea is to let the learner itself iteratively generate pseudo-supervision for unlabeled instances based on its current hypothesis. In combination with consistency regularization…
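A minimal sketch of one such self-training step, where the model's current hypothesis supplies pseudo-labels and only confident predictions are kept; the confidence threshold and masking scheme are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def pseudo_label_step(model, optimizer, unlabeled_x, threshold=0.95):
    """One self-training step: the model's current predictions become
    training targets for unlabeled inputs it is confident about."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        conf, pseudo_y = probs.max(dim=1)
    mask = conf >= threshold  # keep only confident pseudo-labels
    if mask.sum() == 0:
        return None           # nothing confident enough this step
    model.train()
    loss = F.cross_entropy(model(unlabeled_x[mask]), pseudo_y[mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Tiny demo with a hypothetical linear classifier and a low threshold.
model = torch.nn.Linear(20, 5)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
print(pseudo_label_step(model, opt, torch.randn(32, 20), threshold=0.2))
```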
Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining. Despite their success, these methods have been primarily applied to unlabeled ImageNet images, and show marginal gains when trained on…