ﻻ يوجد ملخص باللغة العربية
In this work, we exploit the task of joint classification and weakly supervised localization of thoracic diseases from chest radiographs, with only image-level disease labels coupled with disease severity-level (DSL) information of a subset. A convolutional neural network (CNN) based attention-guided curriculum learning (AGCL) framework is presented, which leverages the severity-level attributes mined from radiology reports. Images in order of difficulty (grouped by different severity-levels) are fed to CNN to boost the learning gradually. In addition, highly confident samples (measured by classification probabilities) and their corresponding class-conditional heatmaps (generated by the CNN) are extracted and further fed into the AGCL framework to guide the learning of more distinctive convolutional features in the next iteration. A two-path network architecture is designed to regress the heatmaps from selected seed samples in addition to the original classification task. The joint learning scheme can improve the classification and localization performance along with more seed samples for the next iteration. We demonstrate the effectiveness of this iterative refinement framework via extensive experimental evaluations on the publicly available ChestXray14 dataset. AGCL achieves over 5.7% (averaged over 14 diseases) increase in classification AUC and 7%/11% increases in Recall/Precision for the localization task compared to the state of the art.
To enable a deep learning-based system to be used in the medical domain as a computer-aided diagnosis system, it is essential to not only classify diseases but also present the locations of the diseases. However, collecting instance-level annotations
Although recent advances in deep learning accelerated an improvement in a weakly supervised object localization (WSOL) task, there are still challenges to identify the entire body of an object, rather than only discriminative parts. In this paper, we
Weakly Supervised Object Localization (WSOL) techniques learn the object location only using image-level labels, without location annotations. A common limitation for these techniques is that they cover only the most discriminative part of the object
The research on recognizing the most discriminative regions provides referential information for weakly supervised object localization with only image-level annotations. However, the most discriminative regions usually conceal the other parts of the
Weakly-supervised temporal action localization is a problem of learning an action localization model with only video-level action labeling available. The general framework largely relies on the classification activation, which employs an attention mo