New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Learning Important Features Through Propagating Activation Differences

62 0 0.0 ( 0 )

Download Cite

Added by Avanti Shrikumar

Publication date 2017

fields Informatics Engineering

and research's language is English

Authors Avanti Shrikumar - Peyton Greenside - Anshul Kundaje

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The purported black box nature of neural networks is a barrier to adoption in applications where interpretability is essential. Here we present DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input. DeepLIFT compares the activation of each neuron to its reference activation and assigns contribution scores according to the difference. By optionally giving separate consideration to positive and negative contributions, DeepLIFT can also reveal dependencies which are missed by other approaches. Scores can be computed efficiently in a single backward pass. We apply DeepLIFT to models trained on MNIST and simulated genomic data, and show significant advantages over gradient-based methods. Video tutorial: http://goo.gl/qKb7pL, ICML slides: bit.ly/deeplifticmlslides, ICML talk: https://vimeo.com/238275076, code: http://goo.gl/RM8jvH.

rate research

Not Just a Black Box: Learning Important Features Through Propagating Activation Differences

90 - Avanti Shrikumar , Peyton Greenside , Anna Shcherbina 2016

Note: This paper describes an older version of DeepLIFT. See https://arxiv.org/abs/1704.02685 for the newer version. Original abstract follows: The purported black box nature of neural networks is a barrier to adoption in applications where interpretability is essential. Here we present DeepLIFT (Learning Important FeaTures), an efficient and effective method for computing importance scores in a neural network. DeepLIFT compares the activation of each neuron to its reference activation and assigns contribution scores according to the difference. We apply DeepLIFT to models trained on natural images and genomic data, and show significant advantages over gradient-based methods.

Machine Learning Computer Vision and Pattern Recognition Neural and Evolutionary Computing

What it Thinks is Important is Important: Robustness Transfers through Input Gradients

198 - Alvin Chan , Yi Tay , Yew-Soon Ong 2019

Adversarial perturbations are imperceptible changes to input pixels that can change the prediction of deep learning models. Learned weights of models robust to such perturbations are previously found to be transferable across different tasks but this applies only if the model architecture for the source and target tasks is the same. Input gradients characterize how small changes at each input pixel affect the model output. Using only natural images, we show here that training a student models input gradients to match those of a robust teacher model can gain robustness close to a strong baseline that is robustly trained from scratch. Through experiments in MNIST, CIFAR-10, CIFAR-100 and Tiny-ImageNet, we show that our proposed method, input gradient adversarial matching, can transfer robustness across different tasks and even across different model architectures. This demonstrates that directly targeting the semantics of input gradients is a feasible way towards adversarial robustness.

Machine Learning Computer Vision and Pattern Recognition Neural and Evolutionary Computing

Learning Multiscale Features Directly From Waveforms

199 - Zhenyao Zhu , Jesse H. Engel , Awni Hannun 2016

Deep learning has dramatically improved the performance of speech recognition systems through learning hierarchies of features optimized for the task at hand. However, true end-to-end learning, where features are learned directly from waveforms, has only recently reached the performance of hand-tailored representations based on the Fourier transform. In this paper, we detail an approach to use convolutional filters to push past the inherent tradeoff of temporal and frequency resolution that exists for spectral representations. At increased computational cost, we show that increasing temporal resolution via reduced stride and increasing frequency resolution via additional filters delivers significant performance improvements. Further, we find more efficient representations by simultaneously learning at multiple scales, leading to an overall decrease in word error rate on a difficult internal speech test set by 20.7% relative to networks with the same number of parameters trained on spectrograms.

Computation and Language Machine Learning Neural and Evolutionary Computing

Angular Learning: Toward Discriminative Embedded Features

92 - JT Wu , L.Wang 2019

The margin-based softmax loss functions greatly enhance intra-class compactness and perform well on the tasks of face recognition and object classification. Outperformance, however, depends on the careful hyperparameter selection. Moreover, the hard angle restriction also increases the risk of overfitting. In this paper, angular loss suggested by maximizing the angular gradient to promote intra-class compactness avoids overfitting. Besides, our method has only one adjustable constant for intra-class compactness control. We define three metrics to measure inter-class separability and intra-class compactness. In experiments, we test our method, as well as other methods, on many well-known datasets. Experimental results reveal that our method has the superiority of accuracy improvement, discriminative information, and time-consumption.

Computer Vision and Pattern Recognition Machine Learning Machine Learning

Learning Discriminative Features via Label Consistent Neural Network

88 - Zhuolin Jiang , Yaming Wang , Larry Davis 2016

Deep Convolutional Neural Networks (CNN) enforces supervised information only at the output layer, and hidden layers are trained by back propagating the prediction error from the output layer without explicit supervision. We propose a supervised feature learning approach, Label Consistent Neural Network, which enforces direct supervision in late hidden layers. We associate each neuron in a hidden layer with a particular class label and encourage it to be activated for input signals from the same class. More specifically, we introduce a label consistency regularization called discriminative representation error loss for late hidden layers and combine it with classification error loss to build our overall objective function. This label consistency constraint alleviates the common problem of gradient vanishing and tends to faster convergence; it also makes the features derived from late hidden layers discriminative enough for classification even using a simple $k$-NN classifier, since input signals from the same class will have very similar representations. Experimental results demonstrate that our approach achieves state-of-the-art performances on several public benchmarks for action and object category recognition.

Computer Vision and Pattern Recognition Machine Learning Multimedia

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Learning Important Features Through Propagating Activation Differences

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions