On Data-Augmentation and Consistency-Based Semi-Supervised Learning

نشر في Alexandre Thiery بتاريخ 2021 في مجال الاحصاء الرياضي والبحث باللغة English تحميل البحث

الملخص بالإنكليزية

Recently proposed consistency-based Semi-Supervised Learning (SSL) methods such as the $Pi$-model, temporal ensembling, the mean teacher, or the virtual adversarial training, have advanced the state of the art in several SSL tasks. These methods can typically reach performances that are comparable to their fully supervised counterparts while using only a fraction of labelled examples. Despite these methodological advances, the understanding of these methods is still relatively limited. In this text, we analyse (variations of) the $Pi$-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performances. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods.

تحميل البحث