Adversarial examples have emerged as a ubiquitous property of machine learning models, where bounded adversarial perturbations can mislead a model into making arbitrarily incorrect predictions. Such examples provide a way to assess the robustness of machine learning models as well as a proxy for understanding the model training process. Numerous studies attempt to explain the existence of adversarial examples and propose ways to improve model robustness (e.g., adversarial training). While these works mostly focus on models trained on datasets with predefined labels, we adopt the teacher-student framework and assume a teacher model, or oracle, provides the labels for given instances. We extend Tian (2019) to the case of low-rank input data and show that student specialization (a trained student neuron becoming highly correlated with some teacher neuron at the same layer) still occurs within the input subspace, but the teacher and student nodes can differ wildly outside the data subspace, which we conjecture leads to adversarial examples. Extensive experiments show that student specialization correlates strongly with model robustness in different scenarios, including students trained via standard training, adversarial training, confidence-calibrated adversarial training, and training with a robust-feature dataset. Our study may shed light on future explorations of adversarial examples and on enhancing model robustness via principled data augmentation.
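The sketch below is a minimal illustration (not the paper's experimental code) of the teacher-student setup with low-rank inputs described above: a student is trained to match a fixed teacher on inputs confined to an r-dimensional subspace, and specialization is then measured as the maximum cosine similarity between each student hidden neuron's weight vector and the teacher's hidden neurons, both inside the data subspace and in the full input space. All dimensions, architectures, and the training budget are illustrative assumptions.

```python
# Hypothetical sketch of probing student specialization in a teacher-student
# setting with low-rank input data. Hyperparameters are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, r, h_teacher, h_student = 50, 10, 20, 40  # input dim, data-subspace rank, hidden widths

# Low-rank data subspace: inputs live in span(U), with U an orthonormal d x r basis.
U, _ = torch.linalg.qr(torch.randn(d, r))

def sample_inputs(n):
    # Draw points inside the r-dimensional data subspace.
    return torch.randn(n, r) @ U.T

teacher = nn.Sequential(nn.Linear(d, h_teacher), nn.ReLU(), nn.Linear(h_teacher, 1))
student = nn.Sequential(nn.Linear(d, h_student), nn.ReLU(), nn.Linear(h_student, 1))
for p in teacher.parameters():
    p.requires_grad_(False)  # teacher acts as a fixed oracle

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for step in range(5000):
    x = sample_inputs(256)
    loss = ((student(x) - teacher(x)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

def max_abs_cosine(ws, wt, eps=1e-8):
    # For each student hidden weight vector, max |cos| with any teacher hidden weight vector.
    ws = ws / (ws.norm(dim=1, keepdim=True) + eps)
    wt = wt / (wt.norm(dim=1, keepdim=True) + eps)
    return (ws @ wt.T).abs().max(dim=1).values

Ws = student[0].weight.detach()   # (h_student, d)
Wt = teacher[0].weight.detach()   # (h_teacher, d)
P = U @ U.T                       # projector onto the data subspace

print("specialization in full input space :", max_abs_cosine(Ws, Wt).mean().item())
print("specialization inside data subspace:", max_abs_cosine(Ws @ P, Wt @ P).mean().item())
```

Under the abstract's conjecture, one would expect the in-subspace similarity to be markedly higher than the full-space similarity, since the training data never constrains the student's behavior outside span(U).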