A Biologically Interpretable Two-stage Deep Neural Network (BIT-DNN) For Vegetation Recognition From Hyperspectral Imagery

87 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yue Shi

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yue Shi - Liangxiu Han - Wenjiang Huang

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Spectral-spatial based deep learning models have recently proven to be effective in hyperspectral image (HSI) classification for various earth monitoring applications such as land cover classification and agricultural monitoring. However, due to the nature of black-box model representation, how to explain and interpret the learning process and the model decision, especially for vegetation classification, remains an open challenge. This study proposes a novel interpretable deep learning model -- a biologically interpretable two-stage deep neural network (BIT-DNN), by incorporating the prior-knowledge (i.e. biophysical and biochemical attributes and their hierarchical structures of target entities) based spectral-spatial feature transformation into the proposed framework, capable of achieving both high accuracy and interpretability on HSI based classification tasks. The proposed model introduces a two-stage feature learning process: in the first stage, an enhanced interpretable feature block extracts the low-level spectral features associated with the biophysical and biochemical attributes of target entities; and in the second stage, an interpretable capsule block extracts and encapsulates the high-level joint spectral-spatial features representing the hierarchical structure of biophysical and biochemical attributes of these target entities, which provides the model an improved performance on classification and intrinsic interpretability with reduced computational complexity. We have tested and evaluated the model using four real HSI datasets for four separate tasks (i.e. plant species classification, land cover classification, urban scene recognition, and crop disease recognition tasks). The proposed model has been compared with five state-of-the-art deep learning models.

قيم البحث

82 - Behnood Rasti , Danfeng Hong , Renlong Hang 2020

Hyperspectral images provide detailed spectral information through hundreds of (narrow) spectral channels (also known as dimensionality or bands) with continuous spectral information that can accurately classify diverse materials of interest. The inc reased dimensionality of such data makes it possible to significantly improve data information content but provides a challenge to the conventional techniques (the so-called curse of dimensionality) for accurate analysis of hyperspectral images. Feature extraction, as a vibrant field of research in the hyperspectral community, evolved through decades of research to address this issue and extract informative features suitable for data representation and classification. The advances in feature extraction have been inspired by two fields of research, including the popularization of image and signal processing as well as machine (deep) learning, leading to two types of feature extraction approaches named shallow and deep techniques. This article outlines the advances in feature extraction approaches for hyperspectral imagery by providing a technical overview of the state-of-the-art techniques, providing useful entry points for researchers at different levels, including students, researchers, and senior researchers, willing to explore novel investigations on this challenging topic. In more detail, this paper provides a birds eye view over shallow (both supervised and unsupervised) and deep feature extraction approaches specifically dedicated to the topic of hyperspectral feature extraction and its application on hyperspectral image classification. Additionally, this paper compares 15 advanced techniques with an emphasis on their methodological foundations in terms of classification accuracies. Furthermore, the codes and libraries are shared at https://github.com/BehnoodRasti/HyFTech-Hyperspectral-Shallow-Deep-Feature-Extraction-Toolbox.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

Learning long-term dependencies for action recognition with a biologically-inspired deep network

139 - Yemin Shi , Yonghong Tian , Yaowei Wang 2016

Despite a lot of research efforts devoted in recent years, how to efficiently learn long-term dependencies from sequences still remains a pretty challenging task. As one of the key models for sequence learning, recurrent neural network (RNN) and its variants such as long short term memory (LSTM) and gated recurrent unit (GRU) are still not powerful enough in practice. One possible reason is that they have only feedforward connections, which is different from the biological neural system that is typically composed of both feedforward and feedback connections. To address this problem, this paper proposes a biologically-inspired deep network, called shuttleNetfootnote{Our code is available at url{https://github.com/shiyemin/shuttlenet}}. Technologically, the shuttleNet consists of several processors, each of which is a GRU while associated with multiple groups of cells and states. Unlike traditional RNNs, all processors inside shuttleNet are loop connected to mimic the brains feedforward and feedback connections, in which they are shared across multiple pathways in the loop connection. Attention mechanism is then employed to select the best information flow pathway. Extensive experiments conducted on two benchmark datasets (i.e UCF101 and HMDB51) show that we can beat state-of-the-art methods by simply embedding shuttleNet into a CNN-RNN framework.

الرؤية الحاسوبية وتمييز الأنماط

GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition

131 - Bin Sun , Dehui Kong , Shaofan Wang 2021

Zero-shot action recognition can recognize samples of unseen classes that are unavailable in training by exploring common latent semantic representation in samples. However, most methods neglected the connotative relation and extensional relation bet ween the action classes, which leads to the poor generalization ability of the zero-shot learning. Furthermore, the learned classifier incline to predict the samples of seen class, which leads to poor classification performance. To solve the above problems, we propose a two-stage deep neural network for zero-shot action recognition, which consists of a feature generation sub-network serving as the sampling stage and a graph attention sub-network serving as the classification stage. In the sampling stage, we utilize a generative adversarial networks (GAN) trained by action features and word vectors of seen classes to synthesize the action features of unseen classes, which can balance the training sample data of seen classes and unseen classes. In the classification stage, we construct a knowledge graph (KG) based on the relationship between word vectors of action classes and related objects, and propose a graph convolution network (GCN) based on attention mechanism, which dynamically updates the relationship between action classes and objects, and enhances the generalization ability of zero-shot learning. In both stages, we all use word vectors as bridges for feature generation and classifier generalization from seen classes to unseen classes. We compare our method with state-of-the-art methods on UCF101 and HMDB51 datasets. Experimental results show that our proposed method improves the classification performance of the trained classifier and achieves higher accuracy.

الرؤية الحاسوبية وتمييز الأنماط

Spectral-Spatial Graph Reasoning Network for Hyperspectral Image Classification

143 - Di Wang , Bo Du , Liangpei Zhang 2021

In this paper, we propose a spectral-spatial graph reasoning network (SSGRN) for hyperspectral image (HSI) classification. Concretely, this network contains two parts that separately named spatial graph reasoning subnetwork (SAGRN) and spectral graph reasoning subnetwork (SEGRN) to capture the spatial and spectral graph contexts, respectively. Different from the previous approaches implementing superpixel segmentation on the original image or attempting to obtain the category features under the guide of label image, we perform the superpixel segmentation on intermediate features of the network to adaptively produce the homogeneous regions to get the effective descriptors. Then, we adopt a similar idea in spectral part that reasonably aggregating the channels to generate spectral descriptors for spectral graph contexts capturing. All graph reasoning procedures in SAGRN and SEGRN are achieved through graph convolution. To guarantee the global perception ability of the proposed methods, all adjacent matrices in graph reasoning are obtained with the help of non-local self-attention mechanism. At last, by combining the extracted spatial and spectral graph contexts, we obtain the SSGRN to achieve a high accuracy classification. Extensive quantitative and qualitative experiments on three public HSI benchmarks demonstrate the competitiveness of the proposed methods compared with other state-of-the-art approaches.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

A Deep Convolutional Neural Network for the Detection of Polyps in Colonoscopy Images

99 - Tariq Rahim , Syed Ali Hassan , Soo Young Shin 2020

Computerized detection of colonic polyps remains an unsolved issue because of the wide variation in the appearance, texture, color, size, and presence of the multiple polyp-like imitators during colonoscopy. In this paper, we propose a deep convoluti onal neural network based model for the computerized detection of polyps within colonoscopy images. The proposed model comprises 16 convolutional layers with 2 fully connected layers, and a Softmax layer, where we implement a unique approach using different convolutional kernels within the same hidden layer for deeper feature extraction. We applied two different activation functions, MISH and rectified linear unit activation functions for deeper propagation of information and self regularized smooth non-monotonicity. Furthermore, we used a generalized intersection of union, thus overcoming issues such as scale invariance, rotation, and shape. Data augmentation techniques such as photometric and geometric distortions are adapted to overcome the obstacles faced in polyp detection. Detailed benchmarked results are provided, showing better performance in terms of precision, sensitivity, F1- score, F2- score, and dice-coefficient, thus proving the efficacy of the proposed model.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو