A Closed-Form Learned Pooling for Deep Classification Networks

85 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hossein Mobahi

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Vighnesh Birodkar - Hossein Mobahi - Dilip Krishnan

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In modern computer vision tasks, convolutional neural networks (CNNs) are indispensable for image classification tasks due to their efficiency and effectiveness. Part of their superiority compared to other architectures, comes from the fact that a single, local filter is shared across the entire image. However, there are scenarios where we may need to treat spatial locations in non-uniform manner. We see this in nature when considering how humans have evolved foveation to process different areas in their field of vision with varying levels of detail. In this paper we propose a way to enable CNNs to learn different pooling weights for each pixel location. We do so by introducing an extended definition of a pooling operator. This operator can learn a strict super-set of what can be learned by average pooling or convolutions. It has the benefit of being shared across feature maps and can be encouraged to be local or diffuse depending on the data. We show that for fixed network weights, our pooling operator can be computed in closed-form by spectral decomposition of matrices associated with class separability. Through experiments, we show that this operator benefits generalization for ResNets and CNNs on the CIFAR-10, CIFAR-100 and SVHN datasets and improves robustness to geometric corruptions and perturbations on the CIFAR-10-C and CIFAR-10-P test sets.

قيم البحث

88 - Maosen Li , Siheng Chen , Ya Zhang 2020

We propose a novel graph cross network (GXN) to achieve comprehensive feature learning from multiple scales of a graph. Based on trainable hierarchical representations of a graph, GXN enables the interchange of intermediate features across scales to promote information flow. Two key ingredients of GXN include a novel vertex infomax pooling (VIPool), which creates multiscale graphs in a trainable manner, and a novel feature-crossing layer, enabling feature interchange across scales. The proposed VIPool selects the most informative subset of vertices based on the neural estimation of mutual information between vertex features and neighborhood features. The intuition behind is that a vertex is informative when it can maximally reflect its neighboring information. The proposed feature-crossing layer fuses intermediate features between two scales for mutual enhancement by improving information flow and enriching multiscale features at hidden layers. The cross shape of the feature-crossing layer distinguishes GXN from many other multiscale architectures. Experimental results show that the proposed GXN improves the classification accuracy by 2.12% and 1.15% on average for graph classification and vertex classification, respectively. Based on the same network, the proposed VIPool consistently outperforms other graph-pooling methods.

التعلم الآلي التعلم الالي

Co-Representation Learning For Classification and Novel Class Detection via Deep Networks

185 - Zhuoyi Wang , Zelun Kong , Hemeng Tao 2018

One of the key challenges of performing label prediction over a data stream concerns with the emergence of instances belonging to unobserved class labels over time. Previously, this problem has been addressed by detecting such instances and using the m for appropriate classifier adaptation. The fundamental aspect of a novel-class detection strategy relies on the ability of comparison among observed instances to discriminate them into known and unknown classes. Therefore, studies in the past have proposed various metrics suitable for comparison over the observed feature space. Unfortunately, these similarity measures fail to reliably identify distinct regions in observed feature spaces useful for class discrimination and novel-class detection, especially in streams containing high-dimensional data instances such as images and texts. In this paper, we address this key challenge by proposing a semi-supervised multi-task learning framework called sysname{} which aims to intrinsically search for a latent space suitable for detecting labels of instances from both known and unknown classes. We empirically measure the performance of sysname{} over multiple real-world image and text datasets and demonstrate its superiority by comparing its performance with existing semi-supervised methods.

التعلم الآلي التعلم الالي

Autoencoders as Weight Initialization of Deep Classification Networks for Cancer versus Cancer Studies

112 - Mafalda Falcao Ferreira , Rui Camacho , Luis F. Teixeira 2020

Cancer is still one of the most devastating diseases of our time. One way of automatically classifying tumor samples is by analyzing its derived molecular information (i.e., its genes expression signatures). In this work, we aim to distinguish three different types of cancer: thyroid, skin, and stomach. For that, we compare the performance of a Denoising Autoencoder (DAE) used as weight initialization of a deep neural network. Although we address a different domain problem in this work, we have adopted the same methodology of Ferreira et al.. In our experiments, we assess two different approaches when training the classification model: (a) fixing the weights, after pre-training the DAE, and (b) allowing fine-tuning of the entire classification network. Additionally, we apply two different strategies for embedding the DAE into the classification network: (1) by only importing the encoding layers, and (2) by inserting the complete autoencoder. Our best result was the combination of unsupervised feature learning through a DAE, followed by its full import into the classification network, and subsequent fine-tuning through supervised training, achieving an F1 score of 98.04% +/- 1.09 when identifying cancerous thyroid samples.

التعلم الآلي التعلم الالي

Quantifying Intrinsic Uncertainty in Classification via Deep Dirichlet Mixture Networks

112 - Qingyang Wu , He Li , Lexin Li 2019

With the widespread success of deep neural networks in science and technology, it is becoming increasingly important to quantify the uncertainty of the predictions produced by deep learning. In this paper, we introduce a new method that attaches an e xplicit uncertainty statement to the probabilities of classification using deep neural networks. Precisely, we view that the classification probabilities are sampled from an unknown distribution, and we propose to learn this distribution through the Dirichlet mixture that is flexible enough for approximating any continuous distribution on the simplex. We then construct credible intervals from the learned distribution to assess the uncertainty of the classification probabilities. Our approach is easy to implement, computationally efficient, and can be coupled with any deep neural network architecture. Our method leverages the crucial observation that, in many classification applications such as medical diagnosis, more than one class labels are available for each observational unit. We demonstrate the usefulness of our approach through simulations and a real data example.

التعلم الآلي التعلم الالي

Deep Learning with a Rethinking Structure for Multi-label Classification

116 - Yao-Yuan Yang , Yi-An Lin , Hong-Min Chu 2018

Multi-label classification (MLC) is an important class of machine learning problems that come with a wide spectrum of applications, each demanding a possibly different evaluation criterion. When solving the MLC problems, we generally expect the learn ing algorithm to take the hidden correlation of the labels into account to improve the prediction performance. Extracting the hidden correlation is generally a challenging task. In this work, we propose a novel deep learning framework to better extract the hidden correlation with the help of the memory structure within recurrent neural networks. The memory stores the temporary guesses on the labels and effectively allows the framework to rethink about the goodness and correlation of the guesses before making the final prediction. Furthermore, the rethinking process makes it easy to adapt to different evaluation criteria to match real-world application needs. In particular, the framework can be trained in an end-to-end style with respect to any given MLC evaluation criteria. The end-to-end design can be seamlessly combined with other deep learning techniques to conquer challenging MLC problems like image tagging. Experimental results across many real-world data sets justify that the rethinking framework indeed improves MLC performance across different evaluation criteria and leads to superior performance over state-of-the-art MLC algorithms.

التعلم الآلي التعلم الالي