ﻻ يوجد ملخص باللغة العربية
Deep neural networks bring in impressive accuracy in various applications, but the success often relies on the heavy network architecture. Taking well-trained heavy networks as teachers, classical teacher-student learning paradigm aims to learn a student network that is lightweight yet accurate. In this way, a portable student network with significantly fewer parameters can achieve a considerable accuracy which is comparable to that of teacher network. However, beyond accuracy, robustness of the learned student network against perturbation is also essential for practical uses. Existing teacher-student learning frameworks mainly focus on accuracy and compression ratios, but ignore the robustness. In this paper, we make the student network produce more confident predictions with the help of the teacher network, and analyze the lower bound of the perturbation that will destroy the confidence of the student network. Two important objectives regarding prediction scores and gradients of examples are developed to maximize this lower bound, so as to enhance the robustness of the student network without sacrificing the performance. Experiments on benchmark datasets demonstrate the efficiency of the proposed approach to learn robust student networks which have satisfying accuracy and compact sizes.
A unique cognitive capability of humans consists in their ability to acquire new knowledge and skills from a sequence of experiences. Meanwhile, artificial intelligence systems are good at learning only the last given task without being able to remem
Recently, consistency-based methods have achieved state-of-the-art results in semi-supervised learning (SSL). These methods always involve two roles, an explicit or implicit teacher model and a student model, and penalize predictions under different
As machine learning (ML) systems become pervasive, safeguarding their security is critical. Recent work has demonstrated that motivated adversaries could add adversarial perturbations to the test data to mislead ML systems. So far, most research has
Convolutional Neural Networks (CNNs) are known to rely more on local texture rather than global shape when making decisions. Recent work also indicates a close relationship between CNNs texture-bias and its robustness against distribution shift, adve
In humans and animals, curriculum learning -- presenting data in a curated order - is critical to rapid learning and effective pedagogy. Yet in machine learning, curricula are not widely used and empirically often yield only moderate benefits. This s