Dermo-DOCTOR: A framework for concurrent skin lesion detection and recognition using a deep convolutional neural network with end-to-end dual encoders

Published by: Md. Kamrul Hasan

Publication date: 2021

Research field: Electronic Engineering, Informatics Engineering

Language: English

Authors: Md. Kamrul Hasan, Shidhartho Roy, Chayan Mondal

Abstract

Automated skin lesion analysis for simultaneous detection and recognition remains challenging because of inter-class homogeneity and intra-class heterogeneity, which limit the generalization capability of a single Convolutional Neural Network (CNN) trained on limited datasets. This article proposes an end-to-end deep CNN-based framework for simultaneous detection and recognition of skin lesions, named Dermo-DOCTOR, consisting of two encoders. The feature maps from the two encoders are fused channel-wise into a Fused Feature Map (FFM). The FFM is decoded in the detection sub-network, where the outputs of each stage of the two encoders are concatenated with the corresponding decoder layers to recover the spatial information lost to pooling in the encoders. In the recognition sub-network, the outputs of three fully connected layers, which use the feature maps of the two encoders and the FFM, are aggregated to obtain the final lesion class. We train and evaluate the proposed Dermo-DOCTOR on two publicly available benchmark datasets, ISIC-2016 and ISIC-2017. The segmentation results reach a mean intersection over union of 85.0% and 80.0% on the ISIC-2016 and ISIC-2017 test datasets, respectively. Dermo-DOCTOR also performs well in lesion recognition, with areas under the receiver operating characteristic curve of 0.98 and 0.91 on those two datasets, respectively. The experimental results show that Dermo-DOCTOR outperforms alternative methods from the literature designed for skin lesion detection and recognition. Because Dermo-DOCTOR provides better results on two different test datasets, even with limited training data, it can be a promising computer-aided assistive tool for dermatologists.
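
The channel-wise fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "fused channel-wise" means concatenation along the channel axis, which the abstract does not confirm; the function name `fuse_channelwise`, the channels-first list layout, and the toy values are all hypothetical and not taken from the paper.

```python
def fuse_channelwise(fmap_a, fmap_b):
    """Concatenate two channels-first feature maps along the channel axis.

    Each feature map is a list of channels; each channel is a list of rows.
    Concatenation is an assumption about how the fusion is done.
    """
    shape_a = (len(fmap_a[0]), len(fmap_a[0][0]))
    shape_b = (len(fmap_b[0]), len(fmap_b[0][0]))
    if shape_a != shape_b:
        raise ValueError("spatial sizes must match before fusion")
    # Stacking the channel lists keeps every spatial location aligned.
    return fmap_a + fmap_b


# Toy 2x2 maps: encoder A yields 2 channels, encoder B yields 1 channel.
enc_a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
enc_b = [[[9, 9], [9, 9]]]
ffm = fuse_channelwise(enc_a, enc_b)
print(len(ffm))  # number of channels in the fused map
```

In a real network the same operation would be a tensor concatenation over the channel dimension, and the two encoders would need matching spatial resolution at the fusion point.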

Makena Low, Priyanka Raina (2019)

For several skin conditions such as vitiligo, accurate segmentation of lesions from skin images is the primary measure of disease progression and severity. Existing methods for vitiligo lesion segmentation require manual intervention. Unfortunately, manual segmentation is time- and labor-intensive, as well as irreproducible between physicians. We introduce a convolutional neural network (CNN) that quickly and robustly performs vitiligo skin lesion segmentation. Our CNN has a U-Net architecture with a modified contracting path. We use the CNN to generate an initial segmentation of the lesion, then refine it by running the watershed algorithm on high-confidence pixels. We train the network on 247 images with a variety of lesion sizes, complexity, and anatomical sites. The network with our modifications noticeably outperforms the state-of-the-art U-Net, with a Jaccard Index (JI) score of 73.6% (compared to 36.7%). Moreover, our method requires only a few seconds for segmentation, in contrast with the previously proposed semi-autonomous watershed approach, which requires 2-29 minutes per image.
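
For readers unfamiliar with the Jaccard Index (also called intersection over union) used to score these segmentations, the following is a minimal sketch of the standard definition on binary masks. The function name `jaccard_index`, the convention of returning 1.0 for two empty masks, and the toy masks are assumptions made for illustration.

```python
def jaccard_index(pred, target):
    """Jaccard index (IoU) of two same-sized binary masks given as 2D lists."""
    intersection = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treat that case as a score of 1.0.
    return intersection / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = jaccard_index(pred, target)
print(score)  # 2 shared pixels out of 4 in the union -> 0.5
```

Averaging this score over all test images gives the mean Jaccard index or mean intersection over union reported in segmentation benchmarks.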

Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning

Knowledge-aware Deep Framework for Collaborative Skin Lesion Segmentation and Melanoma Recognition

Xiaohong Wang, Xudong Jiang, Henghui Ding (2021)

Deep learning techniques have shown their superior performance in dermatologist clinical inspection. Nevertheless, melanoma diagnosis is still a challenging task due to the difficulty of incorporating the useful dermatologist clinical knowledge into the learning process. In this paper, we propose a novel knowledge-aware deep framework that incorporates some clinical knowledge into collaborative learning of two important melanoma diagnosis tasks, i.e., skin lesion segmentation and melanoma recognition. Specifically, to exploit the knowledge of morphological expressions of the lesion region and also the periphery region for melanoma identification, a lesion-based pooling and shape extraction (LPSE) scheme is designed, which transfers the structure information obtained from skin lesion segmentation into melanoma recognition. Meanwhile, to pass the skin lesion diagnosis knowledge from melanoma recognition to skin lesion segmentation, an effective diagnosis guided feature fusion (DGFF) strategy is designed. Moreover, we propose a recursive mutual learning mechanism that further promotes the inter-task cooperation, and thus iteratively improves the joint learning capability of the model for both skin lesion segmentation and melanoma recognition. Experimental results on two publicly available skin lesion datasets show the effectiveness of the proposed method for melanoma analysis.

Image and Video Processing, Computer Vision and Pattern Recognition

Matthews Correlation Coefficient Loss for Deep Convolutional Networks: Application to Skin Lesion Segmentation

Kumar Abhishek, Ghassan Hamarneh (2020)

The segmentation of skin lesions is a crucial task in clinical decision support systems for the computer-aided diagnosis of skin lesions. Although deep learning-based approaches have improved segmentation performance, these models are often susceptible to class imbalance in the data, particularly, the fraction of the image occupied by the background healthy skin. Despite variations of the popular Dice loss function being proposed to tackle the class imbalance problem, the Dice loss formulation does not penalize misclassifications of the background pixels. We propose a novel metric-based loss function using the Matthews correlation coefficient, a metric that has been shown to be efficient in scenarios with skewed class distributions, and use it to optimize deep segmentation models. Evaluations on three skin lesion image datasets: the ISBI ISIC 2017 Skin Lesion Segmentation Challenge dataset, the DermoFit Image Library, and the PH2 dataset, show that models trained using the proposed loss function outperform those trained using Dice loss by 11.25%, 4.87%, and 0.76% respectively in the mean Jaccard index. The code is available at https://github.com/kakumarabhishek/MCC-Loss.
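
To make the idea of a loss based on the Matthews correlation coefficient concrete, here is a minimal sketch that computes one minus a "soft" MCC from predicted foreground probabilities. It is not taken from the authors' repository: the function name `mcc_loss`, the flat-list inputs, the soft confusion-matrix counts, and the `eps` stabilizer are all assumptions.

```python
import math


def mcc_loss(probs, labels, eps=1e-8):
    """Return 1 - MCC, using soft confusion-matrix counts.

    probs:  predicted foreground probabilities in [0, 1], one per pixel
    labels: ground-truth values (0 = background, 1 = lesion), one per pixel
    """
    tp = sum(p * y for p, y in zip(probs, labels))
    tn = sum((1 - p) * (1 - y) for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # eps guards against division by zero when a whole class is absent.
    return 1.0 - numerator / (denominator + eps)


labels = [1, 0, 1, 0]
perfect = mcc_loss([1.0, 0.0, 1.0, 0.0], labels)   # close to 0
inverted = mcc_loss([0.0, 1.0, 0.0, 1.0], labels)  # close to 2
print(perfect, inverted)
```

Unlike the Dice score, the numerator contains the true-negative count, so errors on background pixels change the loss, which is the property the abstract highlights.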

Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning

End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation

Haojie Liu, Ming Lu, Zhiqi Chen (2021)

Recent years have witnessed rapid advances in learnt video coding. Most algorithms have solely relied on the vector-based motion representation and resampling (e.g., optical flow based bilinear sampling) for exploiting the inter frame redundancy. In spite of the great success of adaptive kernel-based resampling (e.g., adaptive convolutions and deformable convolutions) in video prediction for uncompressed videos, integrating such approaches with rate-distortion optimization for inter frame coding has been less successful. Recognizing that each resampling solution offers unique advantages in regions with different motion and texture characteristics, we propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by these two approaches. Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information aggregation (RIA) module using information from the current and multiple past frames. We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements, and combine them adaptively to achieve more accurate inter prediction. Experiments show that our proposed inter coding system can provide better motion-compensated prediction and is more robust to occlusions and complex motions. Together with a jointly trained intra coder and residual coder, the overall learnt hybrid coder yields state-of-the-art coding efficiency in the low-delay scenario, compared to the traditional H.264/AVC and H.265/HEVC, as well as recently published learning-based methods, in terms of both PSNR and MS-SSIM metrics.
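
One simple way to picture the adaptive combination of two predictions through a mode selection map is a per-pixel weighted blend. The sketch below is only an illustration under that assumption: the abstract does not give the combination rule, and `blend_predictions`, the single-channel 2D layout, and the toy values are hypothetical.

```python
def blend_predictions(pred_vector, pred_kernel, mode_map):
    """Per-pixel convex combination of two predicted frames.

    mode_map holds weights in [0, 1]: 1.0 selects the vector-based
    prediction, 0.0 selects the kernel-based prediction.
    """
    blended = []
    for row_v, row_k, row_m in zip(pred_vector, pred_kernel, mode_map):
        blended.append(
            [m * v + (1.0 - m) * k for v, k, m in zip(row_v, row_k, row_m)]
        )
    return blended


pred_vector = [[10.0, 20.0], [30.0, 40.0]]
pred_kernel = [[0.0, 0.0], [0.0, 0.0]]
mode_map = [[1.0, 0.5], [0.0, 0.25]]
frame = blend_predictions(pred_vector, pred_kernel, mode_map)
print(frame)
```

A learned map of this kind lets the model favor one resampling method in smooth-motion regions and the other near occlusions or complex textures.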

Image and Video Processing, Computer Vision and Pattern Recognition

Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks

Ying Zhang, Mohammad Pezeshki, Philemon Brakel (2017)

Convolutional Neural Networks (CNNs) are effective models for reducing spectral variations and modeling spectral correlations in acoustic features for automatic speech recognition (ASR). Hybrid speech recognition systems incorporating CNNs with Hidden Markov Models/Gaussian Mixture Models (HMMs/GMMs) have achieved the state-of-the-art in various benchmarks. Meanwhile, Connectionist Temporal Classification (CTC) with Recurrent Neural Networks (RNNs), which is proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of hybrid settings. However, RNNs are computationally expensive and sometimes difficult to train. In this paper, inspired by the advantages of both CNNs and the CTC approach, we propose an end-to-end speech framework for sequence labeling, by combining hierarchical CNNs with CTC directly without recurrent connections. By evaluating the approach on the TIMIT phoneme recognition task, we show that the proposed model is not only computationally efficient, but also competitive with the existing baseline systems. Moreover, we argue that CNNs have the capability to model temporal correlations with appropriate context information.
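
As background on how a CTC-trained model turns frame-level outputs into a label sequence, the following is a minimal sketch of greedy (best-path) CTC decoding: take the top label per frame, merge consecutive repeats, then drop blanks. Greedy decoding is one common choice and not necessarily the decoder used in the paper; the function name, the blank index of 0, and the toy scores are assumptions.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Pick the best label per frame, merge repeats, then drop blanks."""
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    decoded = []
    previous = None
    for label in best_path:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Seven frames over three classes (index 0 is the blank).
frame_scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
]
sequence = ctc_greedy_decode(frame_scores)
print(sequence)  # best path 1,1,0,1,2,2,0 collapses to [1, 1, 2]
```

The blank between the two runs of label 1 is what allows the same label to appear twice in a row in the output.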

Computation and Language, Machine Learning