Classification regions of deep neural networks

129 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Seyed-Mohsen Moosavi-Dezfooli

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Alhussein Fawzi - Seyed-Mohsen Moosavi-Dezfooli - Pascal Frossard

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The goal of this paper is to analyze the geometric properties of deep neural network classifiers in the input space. We specifically study the topology of classification regions created by deep networks, as well as their associated decision boundary. Through a systematic empirical investigation, we show that state-of-the-art deep nets learn connected classification regions, and that the decision boundary in the vicinity of datapoints is flat along most directions. We further draw an essential connection between two seemingly unrelated properties of deep networks: their sensitivity to additive perturbations in the inputs, and the curvature of their decision boundary. The directions where the decision boundary is curved in fact remarkably characterize the directions to which the classifier is the most vulnerable. We finally leverage a fundamental asymmetry in the curvature of the decision boundary of deep nets, and propose a method to discriminate between original images, and images perturbed with small adversarial examples. We show the effectiveness of this purely geometric approach for detecting small adversarial perturbations in images, and for recovering the labels of perturbed images.

قيم البحث

157 - Shenlong Wang , Simon Suo , Wei-Chiu Ma 2021

Standard convolutional neural networks assume a grid structured input is available and exploit discrete convolutions as their fundamental building blocks. This limits their applicability to many real-world applications. In this paper we propose Param etric Continuous Convolution, a new learnable operator that operates over non-grid structured data. The key idea is to exploit parameterized kernel functions that span the full continuous vector space. This generalization allows us to learn over arbitrary data structures as long as their support relationship is computable. Our experiments show significant improvement over the state-of-the-art in point cloud segmentation of indoor and outdoor scenes, and lidar motion estimation of driving scenes.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي التعلم الآلي

Evaluation of Dataflow through layers of Deep Neural Networks in Classification and Regression Problems

112 - Ahmad Kalhor , Mohsen Saffar , Melika Kheirieh 2019

This paper introduces two straightforward, effective indices to evaluate the input data and the data flowing through layers of a feedforward deep neural network. For classification problems, the separation rate of target labels in the space of datafl ow is explained as a key factor indicating the performance of designed layers in improving the generalization of the network. According to the explained concept, a shapeless distance-based evaluation index is proposed. Similarly, for regression problems, the smoothness rate of target outputs in the space of dataflow is explained as a key factor indicating the performance of designed layers in improving the generalization of the network. According to the explained smoothness concept, a shapeless distance-based smoothness index is proposed for regression problems. To consider more strictly concepts of separation and smoothness, their extend

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي نظرية المعلومات

Multi-path Neural Networks for On-device Multi-domain Visual Classification

92 - Qifei Wang , Junjie Ke , Joshua Greaves 2020

Learning multiple domains/tasks with a single model is important for improving data efficiency and lowering inference cost for numerous vision tasks, especially on resource-constrained mobile devices. However, hand-crafting a multi-domain/task model can be both tedious and challenging. This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices. The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space. An adaptive balanced domain prioritization algorithm is proposed to balance optimizing the joint model on multiple domains simultaneously. The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths. This approach effectively reduces the total number of parameters and FLOPS, encouraging positive knowledge transfer while mitigating negative interference across domains. Extensive evaluations on the Visual Decathlon dataset demonstrate that the proposed multi-path model achieves state-of-the-art performance in terms of accuracy, model size, and FLOPS against other approaches using MobileNetV3-like architectures. Furthermore, the proposed method improves average accuracy over learning single-domain models individually, and reduces the total number of parameters and FLOPS by 78% and 32% respectively, compared to the approach that simply bundles single-domain models for multi-domain learning.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي التعلم الآلي

Deep Neural Networks for the Assessment of Surgical Skills: A Systematic Review

77 - Erim Yanik , Xavier Intes , Uwe Kruger 2021

Surgical training in medical school residency programs has followed the apprenticeship model. The learning and assessment process is inherently subjective and time-consuming. Thus, there is a need for objective methods to assess surgical skills. Here , we use the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to systematically survey the literature on the use of Deep Neural Networks for automated and objective surgical skill assessment, with a focus on kinematic data as putative markers of surgical competency. There is considerable recent interest in deep neural networks (DNN) due to the availability of powerful algorithms, multiple datasets, some of which are publicly available, as well as efficient computational hardware to train and host them. We have reviewed 530 papers, of which we selected 25 for this systematic review. Based on this review, we concluded that DNNs are powerful tools for automated, objective surgical skill assessment using both kinematic and video data. The field would benefit from large, publicly available, annotated datasets that are representative of the surgical trainee and expert demographics and multimodal data beyond kinematics and videos.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي التعلم الآلي

Explaining Deep Convolutional Neural Networks on Music Classification

82 - Keunwoo Choi , George Fazekas , Mark Sandler 2016

Deep convolutional neural networks (CNNs) have been actively adopted in the field of music information retrieval, e.g. genre classification, mood detection, and chord recognition. However, the process of learning and prediction is little understood, particularly when it is applied to spectrograms. We introduce auralisation of a CNN to understand its underlying mechanism, which is based on a deconvolution procedure introduced in [2]. Auralisation of a CNN is converting the learned convolutional features that are obtained from deconvolution into audio signals. In the experiments and discussions, we explain trained features of a 5-layer CNN based on the deconvolved spectrograms and auralised signals. The pairwise correlations per layers with varying different musical attributes are also investigated to understand the evolution of the learnt features. It is shown that in the deep layers, the features are learnt to capture textures, the patterns of continuous distributions, rather than shapes of lines.

التعلم الآلي الذكاء الاصطناعي الوسائط المتعددة