Fooling Network Interpretation in Image Classification

Published by: Akshayvarun Subramanya

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Akshayvarun Subramanya - Vipin Pillai - Hamed Pirsiavash

Computer Vision and Pattern Recognition, Machine Learning

Abstract

Deep neural networks have been shown to be fooled rather easily using adversarial attack algorithms. Practical methods such as adversarial patches have been shown to be extremely effective in causing misclassification. However, these patches are highlighted by standard network interpretation algorithms, thus revealing the identity of the adversary. We show that it is possible to create adversarial patches which not only fool the prediction, but also change what we interpret regarding the cause of the prediction. Moreover, we introduce our attack as a controlled setting to measure the accuracy of interpretation algorithms. We show this using extensive experiments for Grad-CAM interpretation, and the attack transfers to occluding patch interpretation as well. We believe our algorithms can facilitate developing more robust network interpretation tools that truly explain the network's underlying decision-making process.
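To make the interpretation side of the abstract concrete, here is a minimal sketch of the standard Grad-CAM heatmap computation, written in plain Python on nested lists. It is not the authors' attack code. The helper `patch_energy_ratio` is a hypothetical name introduced here to show one way of checking how much of the heatmap falls on a patch region; the toy feature maps and gradients are made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(feature_maps: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Grad-CAM heatmap from K feature maps and the class-score gradients
    with respect to those maps (both given as K matrices of size H x W)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight alpha_k: spatial average of the gradient for channel k.
    alphas = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for alpha, fmap in zip(alphas, feature_maps):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * fmap[i][j]
    # ReLU keeps only locations with a positive influence on the class score.
    return [[max(0.0, v) for v in row] for row in heatmap]


def patch_energy_ratio(heatmap: Matrix, top: int, left: int, size: int) -> float:
    """Hypothetical helper: fraction of heatmap mass inside a square patch."""
    total = sum(sum(row) for row in heatmap)
    inside = sum(
        heatmap[i][j]
        for i in range(top, top + size)
        for j in range(left, left + size)
    )
    return inside / total if total > 0 else 0.0


# Toy input (illustrative values only): two channels on a 2 x 2 grid.
maps = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
grads = [[[4, 4], [4, 4]], [[-2, -2], [-2, -2]]]
heatmap = grad_cam(maps, grads)  # [[4.0, 0.0], [0.0, 0.0]]
```

In this toy case a patch placed at the bottom-right cell receives no heatmap mass, which is the kind of outcome the abstract describes: the interpretation does not point at the patch.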

Hai Phan, Dang Huynh, Yihui He (2019)

MobileNet and Binary Neural Networks are two of the most widely used techniques for constructing deep learning models that perform a variety of tasks on mobile and embedded platforms. In this paper, we present a simple yet efficient scheme to binarize MobileNet at both the activation functions and the model weights. However, training a binary network from scratch with separable depth-wise and point-wise convolutions, as in MobileNet, is not trivial and is prone to divergence. To tackle this training issue, we propose a novel neural network architecture, namely MoBiNet (Mobile Binary Network), in which skip connections are manipulated to prevent information loss and vanishing gradients, thus facilitating the training process. More importantly, while existing binary neural networks often make use of cumbersome backbones such as AlexNet, ResNet, and VGG-16 with float-type pre-trained weight initialization, our MoBiNet focuses on binarizing already-compressed neural networks like MobileNet without needing a pre-trained model to start with. Therefore, our proposal results in an effectively small model while keeping accuracy comparable to existing ones. Experiments on the ImageNet dataset show the potential of MoBiNet, as it achieves 54.40% top-1 accuracy and dramatically reduces the computational cost with binary operators.
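The "binary operators" mentioned above can be illustrated with the generic arithmetic used by binary networks: values are reduced to +1/-1 by their sign, and a dot product then becomes an XNOR followed by a bit count. The sketch below shows only that general idea; it is not MoBiNet itself, and the function names and sample values are assumptions made for illustration.

```python
from typing import List


def binarize(values: List[float]) -> List[int]:
    """Sign binarization: map each real value to +1 or -1 (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs: List[int]) -> int:
    """Pack a +1/-1 vector into one integer, one bit per element (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: bit is 1 where the signs agree
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # agreements minus disagreements


# Illustrative values only.
weights = binarize([0.7, -1.2, 0.1, -0.3])      # [1, -1, 1, -1]
activations = binarize([0.5, 0.4, -0.9, -0.2])  # [1, 1, -1, -1]
result = binary_dot(pack_bits(weights), pack_bits(activations), 4)  # 0
```

The bitwise result matches the ordinary dot product of the two sign vectors, which is why multiply-accumulate hardware can be replaced by much cheaper bit operations.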

Computer Vision and Pattern Recognition, Machine Learning

Spectral-Spatial Graph Reasoning Network for Hyperspectral Image Classification

Di Wang, Bo Du, Liangpei Zhang (2021)

In this paper, we propose a spectral-spatial graph reasoning network (SSGRN) for hyperspectral image (HSI) classification. Concretely, this network contains two parts, named the spatial graph reasoning subnetwork (SAGRN) and the spectral graph reasoning subnetwork (SEGRN), which capture the spatial and spectral graph contexts, respectively. Unlike previous approaches that implement superpixel segmentation on the original image or attempt to obtain category features under the guidance of the label image, we perform superpixel segmentation on intermediate features of the network to adaptively produce homogeneous regions and obtain effective descriptors. We then adopt a similar idea in the spectral part, reasonably aggregating the channels to generate spectral descriptors for capturing spectral graph contexts. All graph reasoning procedures in SAGRN and SEGRN are achieved through graph convolution. To guarantee the global perception ability of the proposed methods, all adjacency matrices in graph reasoning are obtained with the help of a non-local self-attention mechanism. Finally, by combining the extracted spatial and spectral graph contexts, we obtain the SSGRN to achieve high-accuracy classification. Extensive quantitative and qualitative experiments on three public HSI benchmarks demonstrate the competitiveness of the proposed methods compared with other state-of-the-art approaches.
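The combination of graph convolution with an attention-derived adjacency matrix can be sketched as follows. This is a simplified illustration, not the SSGRN implementation: it assumes the adjacency is a row-wise softmax over raw dot-product similarities between node descriptors (the paper's non-local self-attention may use learned projections), and the name `graph_reasoning_layer` is hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def row_softmax(m: Matrix) -> Matrix:
    """Softmax over each row, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def graph_reasoning_layer(nodes: Matrix, weight: Matrix) -> Tuple[Matrix, Matrix]:
    """One graph convolution step: ReLU(A X W), with A built from node similarity."""
    # Assumption: adjacency = softmax of dot-product similarity between descriptors.
    adjacency = row_softmax(matmul(nodes, transpose(nodes)))
    aggregated = matmul(adjacency, nodes)      # each node gathers context from all nodes
    projected = matmul(aggregated, weight)     # linear transform of the gathered context
    return [[max(0.0, v) for v in row] for row in projected], adjacency


# Three node descriptors (e.g. region or channel descriptors), illustrative values only.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, adjacency = graph_reasoning_layer(nodes, identity)
```

Because every row of the adjacency is a full softmax over all nodes, each node can draw on every other node, which corresponds to the global perception property the abstract mentions.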

Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing

Image Labeling on a Network: Using Social-Network Metadata for Image Classification

Julian McAuley, Jure Leskovec (2012)

Large-scale image retrieval benchmarks invariably consist of images from the Web. Many of these benchmarks are derived from online photo sharing networks, like Flickr, which in addition to hosting images also provide a highly interactive social community. Such communities generate rich metadata that can naturally be harnessed for image classification and retrieval. Here we study four popular benchmark datasets, extending them with social-network metadata, such as the groups to which each image belongs, the comment thread associated with the image, who uploaded it, their location, and their network of friends. Since these types of data are inherently relational, we propose a model that explicitly accounts for the interdependencies between images sharing common properties. We model the task as a binary labeling problem on a network, and use structured learning techniques to learn model parameters. We find that social-network metadata are useful in a variety of classification tasks, in many cases outperforming methods based on image content.

Computer Vision and Pattern Recognition, Social and Information Networks, Physics and Society

Frequency Domain Convolutional Neural Network: Accelerated CNN for Large Diabetic Retinopathy Image Classification

Ee Fey Goh, ZhiYuan Chen, Wei Xiang Lim (2021)

The conventional spatial convolution layers in Convolutional Neural Networks (CNNs) are computationally expensive, to the point where training can take days unless the number of layers, the number of training images, or the size of the training images is reduced. An image size of 256x256 pixels is commonly used for most applications of CNNs, but this size is too small for applications like Diabetic Retinopathy (DR) classification, where image details are important for accurate classification. This research proposed Frequency Domain Convolution (FDC) and Frequency Domain Pooling (FDP) layers, built with RFFT, a kernel initialization strategy, convolution artifact removal, and Channel Independent Convolution (CIC), to replace the conventional convolution and pooling layers. The FDC and FDP layers are used to build a Frequency Domain Convolutional Neural Network (FDCNN) that accelerates training on large images for DR classification. The Full FDC layer is an extension of the FDC layer that allows direct use in conventional CNNs; it is also used to modify the VGG16 architecture. FDCNN is shown to be at least 54.21% faster and 70.74% more memory efficient than an equivalent CNN architecture. The modified VGG16 architecture with the Full FDC layer is reported to achieve a shorter training time and a higher accuracy, at 95.63%, than the original VGG16 architecture for DR classification.
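Frequency-domain convolution rests on the convolution theorem: a circular convolution in the spatial domain equals an element-wise product in the frequency domain. The sketch below checks this on a 1-D signal using a naive discrete Fourier transform from the standard library. It is only an illustration of the principle; the FDC layer described above works on 2-D images with RFFT, and this naive transform is quadratic in the signal length rather than fast.

```python
import cmath
from typing import List, Sequence


def dft(signal: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n)
        )
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve_direct(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Circular convolution computed directly in the spatial domain."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


def circular_convolve_frequency(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Same convolution via transform, element-wise product, inverse transform."""
    spectrum_x, spectrum_h = dft(x), dft(h)
    product = [a * b for a, b in zip(spectrum_x, spectrum_h)]
    return [v.real for v in dft(product, inverse=True)]


# Illustrative values: a short signal and a kernel zero-padded to the same length.
signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 0.0, -1.0, 0.0]
direct = circular_convolve_direct(signal, kernel)        # [-2.0, -2.0, 2.0, 2.0]
via_frequency = circular_convolve_frequency(signal, kernel)
```

With a fast transform in place of the naive one, the frequency-domain route replaces the sliding-window sum with one multiplication per frequency, which is where the speed-up for large images comes from.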

Computer Vision and Pattern Recognition, Machine Learning

Deep Subdomain Adaptation Network for Image Classification

Yongchun Zhu, Fuzhen Zhuang, Jindong Wang (2021)

For a target task where labeled data is unavailable, domain adaptation can transfer a learner from a different source domain. Previous deep domain adaptation methods mainly learn a global domain shift, i.e., they align the global source and target distributions without considering the relationships between two subdomains within the same category of different domains, which leads to unsatisfying transfer learning performance because fine-grained information is not captured. Recently, more and more researchers have paid attention to Subdomain Adaptation, which focuses on accurately aligning the distributions of the relevant subdomains. However, most of these approaches are adversarial methods, which contain several loss functions and converge slowly. Based on this, we present the Deep Subdomain Adaptation Network (DSAN), which learns a transfer network by aligning the relevant subdomain distributions of domain-specific layer activations across different domains based on a local maximum mean discrepancy (LMMD). Our DSAN is simple but effective: it does not need adversarial training and converges fast. The adaptation can be achieved easily with most feed-forward network models by extending them with the LMMD loss, which can be trained efficiently via back-propagation. Experiments demonstrate that DSAN can achieve remarkable results on both object recognition tasks and digit classification tasks. Our code will be available at: https://github.com/easezyc/deep-transfer-learning
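The idea of aligning subdomains class by class can be sketched with a simplified, linear-kernel version of a class-weighted mean discrepancy. This is an assumption-laden illustration, not the DSAN loss as published: it compares class-weighted feature means directly instead of using the kernel the authors chose, and the names `class_weights`, `weighted_mean`, and `lmmd_linear` are hypothetical. Source weights would come from one-hot labels, target weights from predicted class probabilities.

```python
from typing import List

Matrix = List[List[float]]


def class_weights(probabilities: Matrix, cls: int) -> List[float]:
    """Normalise each sample's probability for class `cls` so the weights sum to 1."""
    column = [p[cls] for p in probabilities]
    total = sum(column)
    return [v / total if total > 0 else 0.0 for v in column]


def weighted_mean(features: Matrix, weights: List[float]) -> List[float]:
    """Weighted average of the feature vectors, one value per feature dimension."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def lmmd_linear(src_feats: Matrix, src_probs: Matrix,
                tgt_feats: Matrix, tgt_probs: Matrix) -> float:
    """Average over classes of the squared distance between the class-weighted
    source mean and the class-weighted target mean (linear-kernel simplification)."""
    num_classes = len(src_probs[0])
    total = 0.0
    for c in range(num_classes):
        mu_s = weighted_mean(src_feats, class_weights(src_probs, c))
        mu_t = weighted_mean(tgt_feats, class_weights(tgt_probs, c))
        total += sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))
    return total / num_classes


# Illustrative values: two classes, one sample per class in each domain.
src_feats = [[0.0, 0.0], [2.0, 2.0]]
src_labels = [[1.0, 0.0], [0.0, 1.0]]   # one-hot source labels
tgt_feats = [[1.0, 0.0], [2.0, 3.0]]
tgt_probs = [[1.0, 0.0], [0.0, 1.0]]    # stand-in for predicted target probabilities
loss = lmmd_linear(src_feats, src_labels, tgt_feats, tgt_probs)  # 1.0
```

A global discrepancy would compare the overall means of the two domains; weighting by class instead keeps each category's features aligned with the matching category in the other domain.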

Computer Vision and Pattern Recognition, Artificial Intelligence