Deep Instance-Level Hard Negative Mining Model for Histopathology Images

282 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Meng Li

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Meng Li - Lin Wu - Arnold Wiliem

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Histopathology image analysis can be considered as a Multiple instance learning (MIL) problem, where the whole slide histopathology image (WSI) is regarded as a bag of instances (i.e, patches) and the task is to predict a single class label to the WSI. However, in many real-life applications such as computational pathology, discovering the key instances that trigger the bag label is of great interest because it provides reasons for the decision made by the system. In this paper, we propose a deep convolutional neural network (CNN) model that addresses the primary task of a bag classification on a WSI and also learns to identify the response of each instance to provide interpretable results to the final prediction. We incorporate the attention mechanism into the proposed model to operate the transformation of instances and learn attention weights to allow us to find key patches. To perform a balanced training, we introduce adaptive weighing in each training bag to explicitly adjust the weight distribution in order to concentrate more on the contribution of hard samples. Based on the learned attention weights, we further develop a solution to boost the classification performance by generating the bags with hard negative instances. We conduct extensive experiments on colon and breast cancer histopathology data and show that our framework achieves state-of-the-art performance.

قيم البحث

106 - Jing Ge , Guangyu Gao , 2019

Searching persons in large-scale image databases with the query of natural language description is a more practical important applications in video surveillance. Intuitively, for person search, the core issue should be visual-textual association, whi ch is still an extremely challenging task, due to the contradiction between the high abstraction of textual description and the intuitive expression of visual images. However, for this task, while positive image-text pairs are always well provided, most existing methods doesnt tackle this problem effectively by mining more reasonable negative pairs. In this paper, we proposed a novel visual-textual association approach with visual and textual attention, and cross-modality hardest and semi-hard negative pair mining. In order to evaluate the effectiveness and feasibility of the proposed approach, we conduct extensive experiments on typical person search datasdet: CUHK-PEDES, in which our approach achieves the top1 score of 55.32% as a new state-of-the-art. Besides, we also evaluate the semi-hard pair mining approach in COCO caption dataset, and validate the effectiveness and complementarity of the methods.

الرؤية الحاسوبية وتمييز الأنماط

Lesion Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale

322 - Jinzheng Cai , Adam P. Harrison , Youjing Zheng 2020

Acquiring large-scale medical image data, necessary for training machine learning algorithms, is frequently intractable, due to prohibitive expert-driven annotation costs. Recent datasets extracted from hospital archives, e.g., DeepLesion, have begun to address this problem. However, these are often incompletely or noisily labeled, e.g., DeepLesion leaves over 50% of its lesions unlabeled. Thus, effective methods to harvest missing annotations are critical for continued progress in medical image analysis. This is the goal of our work, where we develop a powerful system to harvest missing lesions from the DeepLesion dataset at high precision. Accepting the need for some degree of expert labor to achieve high fidelity, we exploit a small fully-labeled subset of medical image volumes and use it to intelligently mine annotations from the remainder. To do this, we chain together a highly sensitive lesion proposal generator and a very selective lesion proposal classifier. While our framework is generic, we optimize our performance by proposing a 3D contextual lesion proposal generator and by using a multi-view multi-scale lesion proposal classifier. These produce harvested and hard-negative proposals, which we then re-use to finetune our proposal generator by using a novel hard negative suppression loss, continuing this process until no extra lesions are found. Extensive experimental analysis demonstrates that our method can harvest an additional 9,805 lesions while keeping precision above 90%. To demonstrate the benefits of our approach, we show that lesion detectors trained on our harvested lesions can significantly outperform the same variants only trained on the original annotations, with boost of average precision of 7% to 10%. We open source our annotations at https://github.com/JimmyCai91/DeepLesionAnnotation.

الرؤية الحاسوبية وتمييز الأنماط

Towards Instance-level Image-to-Image Translation

141 - Zhiqiang Shen , Mingyang Huang , Jianping Shi 2019

Unpaired Image-to-image Translation is a new rising and challenging vision problem that aims to learn a mapping between unaligned image pairs in diverse domains. Recent advances in this field like MUNIT and DRIT mainly focus on disentangling content and style/attribute from a given image first, then directly adopting the global style to guide the model to synthesize new domain images. However, this kind of approaches severely incurs contradiction if the target domain images are content-rich with multiple discrepant objects. In this paper, we present a simple yet effective instance-aware image-to-image translation approach (INIT), which employs the fine-grained local (instance) and global styles to the target image spatially. The proposed INIT exhibits three import advantages: (1) the instance-level objective loss can help learn a more accurate reconstruction and incorporate diverse attributes of objects; (2) the styles used for target domain of local/global areas are from corresponding spatial regions in source domain, which intuitively is a more reasonable mapping; (3) the joint training process can benefit both fine and coarse granularity and incorporates instance information to improve the quality of global translation. We also collect a large-scale benchmark for the new instance-level translation task. We observe that our synthetic images can even benefit real-world vision tasks like generic object detection.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

LoOp: Looking for Optimal Hard Negative Embeddings for Deep Metric Learning

134 - Bhavya Vasudeva , Puneesh Deora , Saumik Bhattacharya 2021

Deep metric learning has been effectively used to learn distance metrics for different visual tasks like image retrieval, clustering, etc. In order to aid the training process, existing methods either use a hard mining strategy to extract the most in formative samples or seek to generate hard synthetics using an additional network. Such approaches face different challenges and can lead to biased embeddings in the former case, and (i) harder optimization (ii) slower training speed (iii) higher model complexity in the latter case. In order to overcome these challenges, we propose a novel approach that looks for optimal hard negatives (LoOp) in the embedding space, taking full advantage of each tuple by calculating the minimum distance between a pair of positives and a pair of negatives. Unlike mining-based methods, our approach considers the entire space between pairs of embeddings to calculate the optimal hard negatives. Extensive experiments combining our approach and representative metric learning losses reveal a significant boost in performance on three benchmark datasets.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Improving memory banks for unsupervised learning with large mini-batch, consistency and hard negative mining

96 - Adrian Bulat , Enrique Sanchez-Lozano , Georgios Tzimiropoulos 2021

An important component of unsupervised learning by instance-based discrimination is a memory bank for storing a feature representation for each training sample in the dataset. In this paper, we introduce 3 improvements to the vanilla memory bank-base d formulation which brings massive accuracy gains: (a) Large mini-batch: we pull multiple augmentations for each sample within the same batch and show that this leads to better models and enhanced memory bank updates. (b) Consistency: we enforce the logits obtained by different augmentations of the same sample to be close without trying to enforce discrimination with respect to negative samples as proposed by previous approaches. (c) Hard negative mining: since instance discrimination is not meaningful for samples that are too visually similar, we devise a novel nearest neighbour approach for improving the memory bank that gradually merges extremely similar data samples that were previously forced to be apart by the instance level classification loss. Overall, our approach greatly improves the vanilla memory-bank based instance discrimination and outperforms all existing methods for both seen and unseen testing categories with cosine similarity.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي