Crowd disagreement about medical images is informative

59 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Veronika Cheplygina

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Veronika Cheplygina - Josien P. W. Pluim

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Classifiers for medical image analysis are often trained with a single consensus label, based on combining labels given by experts or crowds. However, disagreement between annotators may be informative, and thus removing it may not be the best strategy. As a proof of concept, we predict whether a skin lesion from the ISIC 2017 dataset is a melanoma or not, based on crowd annotations of visual characteristics of that lesion. We compare using the mean annotations, illustrating consensus, to standard deviations and other distribution moments, illustrating disagreement. We show that the mean annotations perform best, but that the disagreement measures are still informative. We also make the crowd annotations used in this paper available at url{https://figshare.com/s/5cbbce14647b66286544}.

قيم البحث

132 - Zeyu Wang , Berthy Feng , Karthik Narasimhan 2020

Despite considerable progress, state of the art image captioning models produce generic captions, leaving out important image details. Furthermore, these systems may even misrepresent the image in order to produce a simpler caption consisting of comm on concepts. In this paper, we first analyze both modern captioning systems and evaluation metrics through empirical experiments to quantify these phenomena. We find that modern captioning systems return higher likelihoods for incorrect distractor sentences compared to ground truth captions, and that evaluation metrics like SPICE can be topped using simple captioning systems relying on object detectors. Inspired by these observations, we design a new metric (SPICE-U) by introducing a notion of uniqueness over the concepts generated in a caption. We show that SPICE-U is better correlated with human judgements compared to SPICE, and effectively captures notions of diversity and descriptiveness. Finally, we also demonstrate a general technique to improve any existing captioning model -- by using mutual information as a re-ranking objective during decoding. Empirically, this results in more unique and informative captions, and improves three different state-of-the-art models on SPICE-U as well as average score over existing metrics.

الرؤية الحاسوبية وتمييز الأنماط

Unsupervised Local Discrimination for Medical Images

159 - Huai Chen , Renzhen Wang , Jieyu Li 2021

Contrastive representation learning is an effective unsupervised method to alleviate the demand for expensive annotated data in medical image processing. Recent work mainly based on instance-wise discrimination to learn global features, while neglect local details, which limit their application in processing tiny anatomical structures, tissues and lesions. Therefore, we aim to propose a universal local discrmination framework to learn local discriminative features to effectively initialize medical models, meanwhile, we systematacially investigate its practical medical applications. Specifically, based on the common property of intra-modality structure similarity, i.e. similar structures are shared among the same modality images, a systematic local feature learning framework is proposed. Instead of making instance-wise comparisons based on global embedding, our method makes pixel-wise embedding and focuses on measuring similarity among patches and regions. The finer contrastive rule makes the learnt representation more generalized for segmentation tasks and outperform extensive state-of-the-art methods by wining 11 out of all 12 downstream tasks in color fundus and chest X-ray. Furthermore, based on the property of inter-modality shape similarity, i.e. structures may share similar shape although in different medical modalities, we joint across-modality shape prior into region discrimination to realize unsupervised segmentation. It shows the feaibility of segmenting target only based on shape description from other modalities and inner pattern similarity provided by region discrimination. Finally, we enhance the center-sensitive ability of patch discrimination by introducing center-sensitive averaging to realize one-shot landmark localization, this is an effective application for patch discrimination.

الرؤية الحاسوبية وتمييز الأنماط

Crowd Counting on Images with Scale Variation and Isolated Clusters

59 - Haoyue Bai , Song Wen , S.-H. Gary Chan 2019

Crowd counting is to estimate the number of objects (e.g., people or vehicles) in an image of unconstrained congested scenes. Designing a general crowd counting algorithm applicable to a wide range of crowd images is challenging, mainly due to the po ssibly large variation in object scales and the presence of many isolated small clusters. Previous approaches based on convolution operations with multi-branch architecture are effective for only some narrow bands of scales and have not captured the long-range contextual relationship due to isolated clustering. To address that, we propose SACANet, a novel scale-adaptive long-range context-aware network for crowd counting. SACANet consists of three major modules: the pyramid contextual module which extracts long-range contextual information and enlarges the receptive field, a scale-adaptive self-attention multi-branch module to attain high scale sensitivity and detection accuracy of isolated clusters, and a hierarchical fusion module to fuse multi-level self-attention features. With group normalization, SACANet achieves better optimality in the training process. We have conducted extensive experiments using the VisDrone2019 People dataset, the VisDrone2019 Vehicle dataset, and some other challenging benchmarks. As compared with the state-of-the-art methods, SACANet is shown to be effective, especially for extremely crowded conditions with diverse scales and scattered clusters, and achieves much lower MAE as compared with baselines.

الرؤية الحاسوبية وتمييز الأنماط

FDRN: A Fast Deformable Registration Network for Medical Images

218 - Kaicong Sun , Sven Simon 2020

Deformable image registration is a fundamental task in medical imaging. Due to the large computational complexity of deformable registration of volumetric images, conventional iterative methods usually face the tradeoff between the registration accur acy and the computation time in practice. In order to boost the registration performance in both accuracy and runtime, we propose a fast convolutional neural network. Specially, to efficiently utilize the memory resources and enlarge the model capacity, we adopt additive forwarding instead of channel concatenation and deepen the network in each encoder and decoder stage. To facilitate the learning efficiency, we leverage skip connection within the encoder and decoder stages to enable residual learning and employ an auxiliary loss at the bottom layer with lowest resolution to involve deep supervision. Particularly, the low-resolution auxiliary loss is weighted by an exponentially decayed parameter during the training phase. In conjunction with the main loss in high-resolution grid, a coarse-to-fine learning strategy is achieved. Last but not least, we introduce an auxiliary loss based on the segmentation prior to improve the registration performance in Dice score. Comparing to the auxiliary loss using average Dice score, the proposed multi-label segmentation loss does not induce additional memory cost in the training phase and can be employed on images with arbitrary amount of categories. In the experiments, we show FDRN outperforms the existing state-of-the-art registration methods for brain MR images by resorting to the compact network structure and efficient learning. Besides, FDRN is a generalized framework for image registration which is not confined to a particular type of medical images or anatomy.

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

MultiMix: Sparingly Supervised, Extreme Multitask Learning From Medical Images

230 - Ayaan Haque , Abdullah-Al-Zubaer Imran , Adam Wang 2020

Semi-supervised learning via learning from limited quantities of labeled data has been investigated as an alternative to supervised counterparts. Maximizing knowledge gains from copious unlabeled data benefit semi-supervised learning settings. Moreov er, learning multiple tasks within the same model further improves model generalizability. We propose a novel multitask learning model, namely MultiMix, which jointly learns disease classification and anatomical segmentation in a sparingly supervised manner, while preserving explainability through bridge saliency between the two tasks. Our extensive experimentation with varied quantities of labeled data in the training sets justify the effectiveness of our multitasking model for the classification of pneumonia and segmentation of lungs from chest X-ray images. Moreover, both in-domain and cross-domain evaluations across the tasks further showcase the potential of our model to adapt to challenging generalization scenarios.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة العربية الخاصة للعلوم والتكنولوجيا

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Crowd disagreement about medical images is informative

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً