Coronary Calcium Detection using 3D Attention Identical Dual Deep Network Based on Weakly Supervised Learning

Added by Yuankai Huo

Publication date: 2018

Field: Informatics Engineering

Language: English

Authors: Yuankai Huo, James G. Terry, Jiachen Wang

Computer Vision and Pattern Recognition

Abstract

Coronary artery calcium (CAC) is a biomarker of advanced subclinical coronary artery disease and predicts myocardial infarction and death prior to age 60 years. Slice-wise manual delineation is regarded as the gold standard for coronary calcium detection. However, manual delineation is time- and resource-intensive and is impractical to apply to large-scale cohorts. In this paper, we propose the attention identical dual network (AID-Net) to perform CAC detection on scan-rescan longitudinal non-contrast CT scans, using weakly supervised attention trained only with per-scan labels. To improve performance, 3D attention mechanisms were integrated into the AID-Net to provide complementary information for the classification task. Moreover, 3D Gradient-weighted Class Activation Mapping (Grad-CAM) was proposed at the testing stage to interpret the behavior of the deep neural network. A total of 5075 non-contrast chest CT scans were used as training, validation, and testing datasets. Baseline performance was assessed on the same cohort. The proposed AID-Net achieved superior performance in classification accuracy (0.9272) and AUC (0.9627).
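The abstract does not spell out how the 3D Grad-CAM map is computed. As a rough illustration, the sketch below applies the standard Grad-CAM formulation to a volume: each feature channel is weighted by the global average of its gradient, the weighted channels are summed per voxel, and negative values are clipped to zero. The function name `grad_cam_3d` and the flattened-voxel layout are assumptions made for this example and are not taken from the paper.

```python
def grad_cam_3d(activations, gradients):
    """Standard Grad-CAM weighting applied to a flattened 3D feature volume.

    activations, gradients: lists of K channels; each channel is a flat list
    of V voxel values (hypothetical layout chosen for this sketch).
    Returns a flat list of V non-negative importance values.
    """
    # Channel weight = global average of the class-score gradient over all voxels.
    weights = [sum(channel) / len(channel) for channel in gradients]

    n_voxels = len(activations[0])
    cam = []
    for v in range(n_voxels):
        # Weighted sum of channel activations at this voxel.
        score = sum(w * channel[v] for w, channel in zip(weights, activations))
        # ReLU keeps only voxels that support the predicted class.
        cam.append(max(0.0, score))
    return cam
```

In practice the flat list would be reshaped back to the feature volume's depth, height, and width and upsampled to the CT resolution before being overlaid on the scan.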

An automatic deep learning approach for coronary artery calcium segmentation

G. Santini, D. Della Latta, N. Martini (2017)

Coronary artery calcium (CAC) is a significant marker of atherosclerosis and cardiovascular events. In this work we present a system for the automatic quantification of the calcium score in ECG-triggered non-contrast-enhanced cardiac computed tomography (CT) images. The proposed system uses a supervised deep learning algorithm, a convolutional neural network (CNN), to segment candidate lesions and classify them as coronary or not; the candidates are first extracted from the heart region using a cardiac atlas. We trained our network with 45 CT volumes; 18 volumes were used to validate the model and 56 to test it. Individual lesions were detected with a sensitivity of 91.24%, a specificity of 95.37%, and a positive predictive value (PPV) of 90.5%. Comparing the calcium score obtained by the system with the calcium score manually evaluated by an expert operator gave a Pearson coefficient of 0.983. High agreement (Cohen's kappa = 0.879) between manual and automatic risk prediction was also observed. These results demonstrate that convolutional neural networks can be effectively applied to the automatic segmentation and classification of coronary calcifications.
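For readers unfamiliar with the metrics quoted above, the following sketch shows how sensitivity, specificity, PPV, and Cohen's kappa are conventionally computed from confusion counts. The function names and the example counts are hypothetical and are not the paper's data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Per-lesion metrics from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true lesions that were detected
        "specificity": tn / (tn + fp),  # fraction of non-lesions correctly rejected
        "ppv": tp / (tp + fp),          # fraction of detections that are true lesions
    }


def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: rater A, columns: rater B)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of the two raters' marginal frequencies per class.
    p_expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up counts, for illustration only.
metrics = detection_metrics(tp=90, fp=10, tn=95, fn=5)
kappa = cohens_kappa([[20, 5], [10, 15]])
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is commonly reported alongside accuracy-style figures for risk-category comparisons.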

Computer Vision and Pattern Recognition

Dual-attention Focused Module for Weakly Supervised Object Localization

Yukun Zhou, Zailiang Chen, Hailan Shen (2019)

Research on recognizing the most discriminative regions provides reference information for weakly supervised object localization with only image-level annotations. However, the most discriminative regions usually overshadow the other parts of the object, thereby impeding recognition and localization of the entire object. To tackle this problem, the Dual-attention Focused Module (DFM) is proposed to enhance object localization performance. Specifically, we present a dual attention module for information fusion, consisting of a position branch and a channel branch. In each branch, an enhancement map and a mask map are derived from the input feature map, which respectively highlight the most discriminative parts or hide them. For the position mask map, we introduce a focused matrix to enhance it, which exploits the principle that the pixels of an object are contiguous. Between the two branches, the enhancement map is integrated with the mask map to partially compensate for the lost information and diversify the features. With the dual-attention module and focused matrix, the entire object region can be precisely recognized using implicit information. Experiments show that DFM outperforms existing methods. In particular, DFM achieves state-of-the-art localization accuracy on ILSVRC 2016 and CUB-200-2011.

Computer Vision and Pattern Recognition

Weakly Supervised Attention Learning for Textual Phrases Grounding

Zhiyuan Fang, Shu Kong, Tianshu Yu (2018)

Grounding textual phrases in visual content is a meaningful yet challenging problem with various potential applications such as image-text inference or text-driven multimedia interaction. Most existing methods adopt a supervised learning mechanism that requires pixel-level ground truth during training. However, fine-grained ground-truth annotation is time-consuming and severely narrows the scope for more general applications. In this extended abstract, we explore methods to flexibly localize image regions from a top-down signal (in the form of a one-hot label or natural language) with a weakly supervised attention learning mechanism. Our model uses two types of modules: a backbone module for capturing visual features, and an attentive module that generates maps based on regularized bilinear pooling. We construct the model in an end-to-end fashion and train it by encouraging the spatial attention map to shift and focus on the region whose visual features best match the top-down signal. We demonstrate preliminary yet promising results on a testbed synthesized from multi-label MNIST data.

Computer Vision and Pattern Recognition

Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection

Zeyi Huang, Yang Zou, Vijayakumar Bhagavatula (2020)

Weakly Supervised Object Detection (WSOD) has emerged as an effective tool to train object detectors using only image-level category labels. However, without object-level labels, WSOD detectors are prone to placing bounding boxes on salient objects, clustered objects, and discriminative object parts. Moreover, image-level category labels do not enforce consistent object detection across different transformations of the same image. To address these issues, we propose a Comprehensive Attention Self-Distillation (CASD) training approach for WSOD. To balance feature learning among all object instances, CASD computes the comprehensive attention aggregated from multiple transformations and feature layers of the same images. To enforce consistent spatial supervision on objects, CASD conducts self-distillation on the WSOD networks, such that the comprehensive attention is approximated simultaneously by multiple transformations and feature layers of the same images. CASD produces new state-of-the-art WSOD results on standard benchmarks such as PASCAL VOC 2007/2012 and MS-COCO.

Computer Vision and Pattern Recognition Machine Learning

Weakly Supervised 3D Object Detection from Point Clouds

Zengyi Qin, Jinglu Wang, Yan Lu (2020)

A crucial task in scene understanding is 3D object detection, which aims to detect and localize the 3D bounding boxes of objects belonging to specific classes. Existing 3D object detectors rely heavily on annotated 3D bounding boxes during training, although these annotations can be expensive to obtain and are available only in limited scenarios. Weakly supervised learning is a promising approach to reducing the annotation requirement, but existing weakly supervised object detectors are mostly for 2D detection rather than 3D. In this work, we propose VS3D, a framework for weakly supervised 3D object detection from point clouds without using any ground truth 3D bounding box for training. First, we introduce an unsupervised 3D proposal module that generates object proposals by leveraging normalized point cloud densities. Second, we present a cross-modal knowledge distillation strategy, where a convolutional neural network learns to predict the final results from the 3D object proposals by querying a teacher network pretrained on image datasets. Comprehensive experiments on the challenging KITTI dataset demonstrate the superior performance of our VS3D in diverse evaluation settings. The source code and pretrained models are publicly available at https://github.com/Zengyi-Qin/Weakly-Supervised-3D-Object-Detection.

Computer Vision and Pattern Recognition