ترغب بنشر مسار تعليمي؟ اضغط هنا

To enable a deep learning-based system to be used in the medical domain as a computer-aided diagnosis system, it is essential to not only classify diseases but also present the locations of the diseases. However, collecting instance-level annotations for various thoracic diseases is expensive. Therefore, weakly supervised localization methods have been proposed that use only image-level annotation. While the previous methods presented the disease location as the most discriminative part for classification, this causes a deep network to localize wrong areas for indistinguishable X-ray images. To solve this issue, we propose a spatial attention method using disease masks that describe the areas where diseases mainly occur. We then apply the spatial attention to find the precise disease area by highlighting the highest probability of disease occurrence. Meanwhile, the various sizes, rotations and noise in chest X-ray images make generating the disease masks challenging. To reduce the variation among images, we employ an alignment module to transform an input X-ray image into a generalized image. Through extensive experiments on the NIH-Chest X-ray dataset with eight kinds of diseases, we show that the proposed method results in superior localization performances compared to state-of-the-art methods.
Conventional methods for object detection usually require substantial amounts of training data and annotated bounding boxes. If there are only a few training data and annotations, the object detectors easily overfit and fail to generalize. It exposes the practical weakness of the object detectors. On the other hand, human can easily master new reasoning rules with only a few demonstrations using previously learned knowledge. In this paper, we introduce a few-shot object detection via knowledge transfer, which aims to detect objects from a few training examples. Central to our method is prototypical knowledge transfer with an attached meta-learner. The meta-learner takes support set images that include the few examples of the novel categories and base categories, and predicts prototypes that represent each category as a vector. Then, the prototypes reweight each RoI (Region-of-Interest) feature vector from a query image to remodels R-CNN predictor heads. To facilitate the remodeling process, we predict the prototypes under a graph structure, which propagates information of the correlated base categories to the novel categories with explicit guidance of prior knowledge that represents correlations among categories. Extensive experiments on the PASCAL VOC dataset verifies the effectiveness of the proposed method.
To understand the black-box characteristics of deep networks, counterfactual explanation that deduces not only the important features of an input space but also how those features should be modified to classify input as a target class has gained an i ncreasing interest. The patterns that deep networks have learned from a training dataset can be grasped by observing the feature variation among various classes. However, current approaches perform the feature modification to increase the classification probability for the target class irrespective of the internal characteristics of deep networks. This often leads to unclear explanations that deviate from real-world data distributions. To address this problem, we propose a counterfactual explanation method that exploits the statistics learned from a training dataset. Especially, we gradually construct an explanation by iterating over masking and composition steps. The masking step aims to select an important feature from the input data to be classified as a target class. Meanwhile, the composition step aims to optimize the previously selected feature by ensuring that its output score is close to the logit space of the training data that are classified as the target class. Experimental results show that our method produces human-friendly interpretations on various classification datasets and verify that such interpretations can be achieved with fewer feature modification.
Few-shot learning aims to classify unseen classes with a few training examples. While recent works have shown that standard mini-batch training with a carefully designed training strategy can improve generalization ability for unseen classes, well-kn own problems in deep networks such as memorizing training statistics have been less explored for few-shot learning. To tackle this issue, we propose self-augmentation that consolidates self-mix and self-distillation. Specifically, we exploit a regional dropout technique called self-mix, in which a patch of an image is substituted into other values in the same image. Then, we employ a backbone network that has auxiliary branches with its own classifier to enforce knowledge sharing. Lastly, we present a local representation learner to further exploit a few training examples for unseen classes. Experimental results show that the proposed method outperforms the state-of-the-art methods for prevalent few-shot benchmarks and improves the generalization ability.
In this article, we consider the problem of few-shot learning for classification. We assume a network trained for base categories with a large number of training examples, and we aim to add novel categories to it that have only a few, e.g., one or fi ve, training examples. This is a challenging scenario because: 1) high performance is required in both the base and novel categories; and 2) training the network for the new categories with a few training examples can contaminate the feature space trained well for the base categories. To address these challenges, we propose two geometric constraints to fine-tune the network with a few training examples. The first constraint enables features of the novel categories to cluster near the category weights, and the second maintains the weights of the novel categories far from the weights of the base categories. By applying the proposed constraints, we extract discriminative features for the novel categories while preserving the feature space learned for the base categories. Using public data sets for few-shot learning that are subsets of ImageNet, we demonstrate that the proposed method outperforms prevalent methods by a large margin.
103 - Sin-Han Kang , Hong-Gyu Jung , 2019
In an effort to interpret black-box models, researches for developing explanation methods have proceeded in recent years. Most studies have tried to identify input pixels that are crucial to the prediction of a classifier. While this approach is mean ingful to analyse the characteristic of blackbox models, it is also important to investigate pixels that interfere with the prediction. To tackle this issue, in this paper, we propose an explanation method that visualizes undesirable regions to classify an image as a target class. To be specific, we divide the concept of undesirable regions into two terms: (1) factors for a target class, which hinder that black-box models identify intrinsic characteristics of a target class and (2) factors for non-target classes that are important regions for an image to be classified as other classes. We visualize such undesirable regions on heatmaps to qualitatively validate the proposed method. Furthermore, we present an evaluation metric to provide quantitative results on ImageNet.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا