A New Gastric Histopathology Subsize Image Database (GasHisSDB) for Classification Algorithm Test: from Linear Regression to Visual Transformer

142 0 0.0 ( 0 )

Download Cite

Added by Weiming Hu

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Weiming Hu - Chen Li - Xiaoyan Li

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

GasHisSDB is a New Gastric Histopathology Subsize Image Database with a total of 245196 images. GasHisSDB is divided into 160160 pixels sub-database, 120120 pixels sub-database and 80*80 pixels sub-database. GasHisSDB is made to realize the function of valuating image classification. In order to prove that the methods of different periods in the field of image classification have discrepancies on GasHisSDB, we select a variety of classifiers for evaluation. Seven classical machine learning classifiers, three CNN classifiers and a novel transformer-based classifier are selected for testing on image classification tasks. GasHisSDB is available at the URL:https://github.com/NEUhwm/GasHisSDB.git.

rate research

GasHis-Transformer: A Multi-scale Visual Transformer Approach for Gastric Histopathology Image Classification

103 - Haoyuan Chen , Chen Li , Xiaoyan Li 2021

Existing deep learning methods for diagnosis of gastric cancer commonly use convolutional neural network. Recently, the Visual Transformer has attracted great attention because of its performance and efficiency, but its applications are mostly in the field of computer vision. In this paper, a multi-scale visual transformer model, referred to as GasHis-Transformer, is proposed for Gastric Histopathological Image Classification (GHIC), which enables the automatic classification of microscopic gastric images into abnormal and normal cases. The GasHis-Transformer model consists of two key modules: A global information module and a local information module to extract histopathological features effectively. In our experiments, a public hematoxylin and eosin (H&E) stained gastric histopathological dataset with 280 abnormal and normal images are divided into training, validation and test sets by a ratio of 1 : 1 : 2. The GasHis-Transformer model is applied to estimate precision, recall, F1-score and accuracy on the test set of gastric histopathological dataset as 98.0%, 100.0%, 96.0% and 98.0%, respectively. Furthermore, a critical study is conducted to evaluate the robustness of GasHis-Transformer, where ten different noises including four adversarial attack and six conventional image noises are added. In addition, a clinically meaningful study is executed to test the gastrointestinal cancer identification performance of GasHis-Transformer with 620 abnormal images and achieves 96.8% accuracy. Finally, a comparative study is performed to test the generalizability with both H&E and immunohistochemical stained images on a lymphoma image dataset and a breast cancer dataset, producing comparable F1-scores (85.6% and 82.8%) and accuracies (83.9% and 89.4%), respectively. In conclusion, GasHisTransformer demonstrates high classification performance and shows its significant potential in the GHIC task.

Computer Vision and Pattern Recognition

A Hierarchical Conditional Random Field-based Attention Mechanism Approach for Gastric Histopathology Image Classification

83 - Yixin Li , Xinran Wu , Chen Li 2021

In the Gastric Histopathology Image Classification (GHIC) tasks, which are usually weakly supervised learning missions, there is inevitably redundant information in the images. Therefore, designing networks that can focus on effective distinguishing features has become a popular research topic. In this paper, to accomplish the tasks of GHIC superiorly and to assist pathologists in clinical diagnosis, an intelligent Hierarchical Conditional Random Field based Attention Mechanism (HCRF-AM) model is proposed. The HCRF-AM model consists of an Attention Mechanism (AM) module and an Image Classification (IC) module. In the AM module, an HCRF model is built to extract attention regions. In the IC module, a Convolutional Neural Network (CNN) model is trained with the attention regions selected and then an algorithm called Classification Probability-based Ensemble Learning is applied to obtain the image-level results from patch-level output of the CNN. In the experiment, a classification specificity of 96.67% is achieved on a gastric histopathology dataset with 700 images. Our HCRF-AM model demonstrates high classification performance and shows its effectiveness and future potential in the GHIC field.

Computer Vision and Pattern Recognition

From visual words to a visual grammar: using language modelling for image classification

316 - Antonio Foncubierta-Rodriguez , Henning Muller , Adrien Depeursinge 2017

The Bag--of--Visual--Words (BoVW) is a visual description technique that aims at shortening the semantic gap by partitioning a low--level feature space into regions of the feature space that potentially correspond to visual concepts and by giving more value to this space. In this paper we present a conceptual analysis of three major properties of language grammar and how they can be adapted to the computer vision and image understanding domain based on the bag of visual words paradigm. Evaluation of the visual grammar shows that a positive impact on classification accuracy and/or descriptor size is obtained when the technique are applied when the proposed techniques are applied.

Computer Vision and Pattern Recognition

Efficient Image Set Classification using Linear Regression based Image Reconstruction

67 - Syed Afaq Ali Shah , Uzair Nadeem , Mohammed Bennamoun 2017

We propose a novel image set classification technique using linear regression models. Downsampled gallery image sets are interpreted as subspaces of a high dimensional space to avoid the computationally expensive training step. We estimate regression models for each test image using the class specific gallery subspaces. Images of the test set are then reconstructed using the regression models. Based on the minimum reconstruction error between the reconstructed and the original images, a weighted voting strategy is used to classify the test set. We performed extensive evaluation on the benchmark UCSD/Honda, CMU Mobo and YouTube Celebrity datasets for face classification, and ETH-80 dataset for object classification. The results demonstrate that by using only a small amount of training data, our technique achieved competitive classification accuracy and superior computational speed compared with the state-of-the-art methods.

Computer Vision and Pattern Recognition

FoveaTer: Foveated Transformer for Image Classification

82 - Aditya Jonnalagadda , William Wang , Miguel P. Eckstein 2021

Many animals and humans process the visual field with a varying spatial resolution (foveated vision) and use peripheral processing to make eye movements and point the fovea to acquire high-resolution information about objects of interest. This architecture results in computationally efficient rapid scene exploration. Recent progress in vision Transformers has brought about new alternatives to the traditionally convolution-reliant computer vision systems. However, these models do not explicitly model the foveated properties of the visual system nor the interaction between eye movements and the classification task. We propose foveated Transformer (FoveaTer) model, which uses pooling regions and saccadic movements to perform object classification tasks using a vision Transformer architecture. Our proposed model pools the image features using squared pooling regions, an approximation to the biologically-inspired foveated architecture, and uses the pooled features as an input to a Transformer Network. It decides on the following fixation location based on the attention assigned by the Transformer to various locations from previous and present fixations. The model uses a confidence threshold to stop scene exploration, allowing to dynamically allocate more fixation/computational resources to more challenging images. We construct an ensemble model using our proposed model and unfoveated model, achieving an accuracy 1.36% below the unfoveated model with 22% computational savings. Finally, we demonstrate our models robustness against adversarial attacks, where it outperforms the unfoveated model.

Computer Vision and Pattern Recognition