ترغب بنشر مسار تعليمي؟ اضغط هنا

Towards Out-of-Distribution Detection with Divergence Guarantee in Deep Generative Models

458   0   0.0 ( 0 )
 نشر من قبل Yufeng Zhang
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Recent research has revealed that deep generative models including flow-based models and Variational autoencoders may assign higher likelihood to out-of-distribution (OOD) data than in-distribution (ID) data. However, we cannot sample out OOD data from the model. This counterintuitive phenomenon has not been satisfactorily explained. In this paper, we prove theorems to investigate the divergences in flow-based model and give two explanations to the above phenomenon from divergence and geometric perspectives, respectively. Based on our analysis, we propose two group anomaly detection methods. Furthermore, we decompose the KL divergence and propose a point-wise anomaly detection method. We have conducted extensive experiments on prevalent benchmarks to evaluate our methods. For group anomaly detection (GAD), our method can achieve near 100% AUROC on all problems and has robustness against data manipulations. On the contrary, the state-of-the-art (SOTA) GAD method performs not better than random guessing for challenging problems and can be attacked by data manipulation in almost all cases. For point-wise anomaly detection (PAD), our method is comparable to the SOTA PAD method on one category of problems and outperforms the baseline significantly on another category of problems.



قيم البحث

اقرأ أيضاً

Out-of-distribution (OoD) detection is a natural downstream task for deep generative models, due to their ability to learn the input probability distribution. There are mainly two classes of approaches for OoD detection using deep generative models, viz., based on likelihood measure and the reconstruction loss. However, both approaches are unable to carry out OoD detection effectively, especially when the OoD samples have smaller variance than the training samples. For instance, both flow based and VAE models assign higher likelihood to images from SVHN when trained on CIFAR-10 images. We use a recently proposed generative model known as neural rendering model (NRM) and derive metrics for OoD. We show that NRM unifies both approaches since it provides a likelihood estimate and also carries out reconstruction in each layer of the neural network. Among various measures, we found the joint likelihood of latent variables to be the most effective one for OoD detection. Our results show that when trained on CIFAR-10, lower likelihood (of latent variables) is assigned to SVHN images. Additionally, we show that this metric is consistent across other OoD datasets. To the best of our knowledge, this is the first work to show consistently lower likelihood for OoD data with smaller variance with deep generative models.
Generative adversarial training (GAT) is a recently introduced adversarial defense method. Previous works have focused on empirical evaluations of its application to training robust predictive models. In this paper we focus on theoretical understandi ng of the GAT method and extending its application to generative modeling and out-of-distribution detection. We analyze the optimal solutions of the maximin formulation employed by the GAT objective, and make a comparative analysis of the minimax formulation employed by GANs. We use theoretical analysis and 2D simulations to understand the convergence property of the training algorithm. Based on these results, we develop an incremental generative training algorithm, and conduct comprehensive evaluations of the algorithms application to image generation and adversarial out-of-distribution detection. Our results suggest that generative adversarial training is a promising new direction for the above applications.
Deep neural networks have achieved great success in classification tasks during the last years. However, one major problem to the path towards artificial intelligence is the inability of neural networks to accurately detect samples from novel class d istributions and therefore, most of the existent classification algorithms assume that all classes are known prior to the training stage. In this work, we propose a methodology for training a neural network that allows it to efficiently detect out-of-distribution (OOD) examples without compromising much of its classification accuracy on the test examples from known classes. We propose a novel loss function that gives rise to a novel method, Outlier Exposure with Confidence Control (OECC), which achieves superior results in OOD detection with OE both on image and text classification tasks without requiring access to OOD samples. Additionally, we experimentally show that the combination of OECC with state-of-the-art post-training OOD detection methods, like the Mahalanobis Detector (MD) and the Gramian Matrices (GM) methods, further improves their performance in the OOD detection task, demonstrating the potential of combining training and post-training methods for OOD detection.
In satellite image analysis, distributional mismatch between the training and test data may arise due to several reasons, including unseen classes in the test data and differences in the geographic area. Deep learning based models may behave in unexp ected manner when subjected to test data that has such distributional shifts from the training data, also called out-of-distribution (OOD) examples. Predictive uncertainly analysis is an emerging research topic which has not been explored much in context of satellite image analysis. Towards this, we adopt a Dirichlet Prior Network based model to quantify distributional uncertainty of deep learning models for remote sensing. The approach seeks to maximize the representation gap between the in-domain and OOD examples for a better identification of unknown examples at test time. Experimental results on three exemplary test scenarios show the efficacy of the model in satellite image analysis.
Deep generative models (e.g. GANs and VAEs) have been developed quite extensively in recent years. Lately, there has been an increased interest in the inversion of such a model, i.e. given a (possibly corrupted) signal, we wish to recover the latent vector that generated it. Building upon sparse representation theory, we define conditions that are applicable to any inversion algorithm (gradient descent, deep encoder, etc.), under which such generative models are invertible with a unique solution. Importantly, the proposed analysis is applicable to any trained model, and does not depend on Gaussian i.i.d. weights. Furthermore, we introduce two layer-wise inversion pursuit algorithms for trained generative networks of arbitrary depth, and accompany these with recovery guarantees. Finally, we validate our theoretical results numerically and show that our method outperforms gradient descent when inverting such generators, both for clean and corrupted signals.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا