A Robust Classification-autoencoder to Defend Outliers and Adversaries


Abstract in English

In this paper, we present a robust classification-autoencoder (CAE) that has a strong ability to recognize outliers and defend against adversaries. The basic idea is to change the autoencoder from an unsupervised learning method into a classifier. The CAE is a modified autoencoder in which the encoder compresses samples with different labels into disjoint compression spaces and the decoder recovers a sample with a given label from the corresponding compression space. The encoder serves as a classifier, and the decoder is used to decide whether the classification given by the encoder is correct by comparing the input sample with the output. Since adversarial samples seem inevitable in the current DNN framework, we introduce list classification based on the CAE to defend against adversaries; it outputs several labels together with the corresponding samples recovered by the CAE. The CAE is evaluated in detail on the MNIST dataset. The results show that the CAE network recognizes almost all outliers and that the list classification contains the correct label for almost all adversaries.
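
The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-label prototype vectors, the functions `toy_encode` and `toy_decode`, the mean-squared-error comparison, the threshold value, and the ranking used for list classification are all assumptions made for illustration, standing in for a trained encoder and decoder.

```python
import numpy as np

# Hypothetical stand-in for a trained CAE: one prototype vector per label.
# A real CAE would use neural networks for the encoder and decoder.
PROTOTYPES = {
    0: np.array([0.0, 0.0]),
    1: np.array([1.0, 1.0]),
    2: np.array([0.0, 1.0]),
}

def toy_encode(x):
    """Toy encoder used as a classifier: pick the label with the nearest prototype."""
    return min(PROTOTYPES, key=lambda lbl: np.sum((x - PROTOTYPES[lbl]) ** 2))

def toy_decode(label):
    """Toy decoder: recover a sample associated with the given label."""
    return PROTOTYPES[label]

def reconstruction_error(x, x_hat):
    """Mean squared error between the input and the recovered sample (assumed metric)."""
    return float(np.mean((x - x_hat) ** 2))

def classify_or_reject(x, threshold=0.05):
    """Classify with the encoder, then accept the label only if the decoder's
    output is close to the input; otherwise flag the input as an outlier (None)."""
    label = toy_encode(x)
    err = reconstruction_error(x, toy_decode(label))
    return (label if err <= threshold else None), err

def list_classify(x, k=2):
    """Assumed form of list classification: return the k labels whose recovered
    samples are closest to the input, together with those samples."""
    candidates = [(lbl, toy_decode(lbl)) for lbl in PROTOTYPES]
    candidates.sort(key=lambda pair: reconstruction_error(x, pair[1]))
    return candidates[:k]

if __name__ == "__main__":
    print(classify_or_reject(np.array([0.9, 1.1])))   # close to the label-1 prototype
    print(classify_or_reject(np.array([5.0, -3.0])))  # far from every prototype
    print([lbl for lbl, _ in list_classify(np.array([0.2, 0.9]))])
```

In this toy setting an input near a prototype is accepted with its label, an input far from every prototype is rejected, and the list classification returns several candidate labels with their recovered samples so that a downstream check can compare each against the input.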
