A conditional random field (CRF) model for cloud detection in ground-based sky images is presented. We show that very high cloud detection accuracy can be achieved by combining a discriminative classifier and a higher-order clique potential in a CRF framework. The image is first divided into homogeneous regions using a mean shift clustering algorithm, and a CRF model is then defined over these regions. The model parameters are estimated from training data, and inference is performed using the Iterated Conditional Modes (ICM) algorithm. We demonstrate how taking spatial context into account can boost accuracy. We present qualitative and quantitative results showing that this framework outperforms other state-of-the-art methods applied to cloud detection.
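The ICM inference step mentioned above can be illustrated with a minimal sketch. It assumes per-region classifier costs are already available and, for brevity, replaces the higher-order clique potential with a simple pairwise Potts smoothness term; the names `icm`, `unary`, `edges`, and `beta` are hypothetical and not taken from the paper.

```python
import numpy as np

def icm(unary, edges, beta=1.0, max_iters=20):
    """Greedy ICM labelling over a region adjacency graph.

    unary: (n_regions, n_labels) array of classifier costs (lower is better).
    edges: list of (i, j) pairs of adjacent regions.
    beta:  Potts penalty for neighbouring regions with different labels
           (an assumed stand-in for the paper's higher-order potential).
    """
    unary = np.asarray(unary, dtype=float)
    n, k = unary.shape
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Start from the classifier-only labelling.
    labels = unary.argmin(axis=1)
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            # Local energy of each candidate label, neighbours held fixed.
            cost = unary[i].copy()
            for j in neighbors[i]:
                cost += beta * (np.arange(k) != labels[j])
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # converged to a local minimum
            break
    return labels

# Five regions in a chain; region 2 weakly prefers "sky" (0),
# while its neighbours strongly prefer "cloud" (1).
unary = [[2.0, 0.0], [2.0, 0.0], [0.4, 0.6], [2.0, 0.0], [2.0, 0.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(icm(unary, edges, beta=0.5).tolist())  # [1, 1, 1, 1, 1]
```

With `beta=0.0` the same call returns the classifier-only labelling `[1, 1, 0, 1, 1]`, which shows how the spatial term overrides a weak local decision.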
Object retrieval and reconstruction from very high resolution (VHR) synthetic aperture radar (SAR) images are of great importance for urban SAR applications, yet highly challenging owing to the complexity of SAR data. This paper addresses the issue of individual building segmentation from a single VHR SAR image in large-scale urban areas. To achieve this, we introduce building footprints from GIS data as complementary information and propose a novel conditional GIS-aware network (CG-Net). The proposed model learns multi-level visual features and employs building footprints to normalize the features for predicting building masks in the SAR image. We validate our method using a high-resolution spotlight TerraSAR-X image collected over Berlin. Experimental results show that the proposed CG-Net brings improvements with various backbones. We further compare two representations of building footprints for our task, namely complete building footprints and sensor-visible footprint segments, and conclude that the former leads to better segmentation results. Moreover, we investigate the impact of inaccurate GIS data on CG-Net and show that it is robust against positioning errors in GIS data. In addition, we propose an approach for generating building ground truth from an accurate digital elevation model (DEM), which can be used to generate large-scale SAR image datasets. The segmentation results can be applied to reconstruct 3D building models at level-of-detail (LoD) 1, which is demonstrated in our experiments.
Diabetic foot ulceration (DFU) and amputation are a cause of significant morbidity. DFU may be prevented by identifying patients at risk and instituting preventative measures through education and offloading. Several studies have reported that thermogram images may help to detect an increase in plantar temperature prior to DFU. However, the distribution of plantar temperature may be heterogeneous, making it difficult to quantify and to use for predicting outcomes. We compared a machine learning-based scoring technique, with feature selection and optimization techniques and learning classifiers, to several state-of-the-art Convolutional Neural Networks (CNNs) on foot thermogram images, and we propose a robust solution to identify the diabetic foot. A comparatively shallow CNN model, MobileNetV2, achieved an F1 score of ~95% for a two-feet thermogram image-based classification, and the AdaBoost classifier used 10 features and achieved an F1 score of 97%. A comparison of the inference time for the best-performing networks confirmed that the proposed algorithm can be deployed as a smartphone application to allow the user to monitor the progression of DFU in a home setting.
Oversight in medical images is a crucial problem, and timely reporting of medical images is desired. Therefore, an all-purpose anomaly detection method that can detect virtually all types of lesions/diseases in a given image is strongly desired. However, few commercially available and versatile anomaly detection methods for medical images have been provided so far. Recently, anomaly detection methods built upon deep learning methods have been rapidly growing in popularity, and these methods seem to provide reasonable solutions to the problem. However, the workload to label the images necessary for training in deep learning remains heavy. In this study, we present an anomaly detection method based on two trained flow-based generative models. With this method, the posterior probability can be computed as a normality metric for any given image. The training of the generative models requires two sets of images: a set containing only normal images and another set containing both normal and abnormal images without any labels. In the latter set, each sample does not have to be labeled as normal or abnormal; therefore, any mixture of images (e.g., all cases in a hospital) can be used as the dataset without cumbersome manual labeling. The method was validated with two types of medical images: chest X-ray radiographs (CXRs) and brain computed tomographies (BCTs). The areas under the receiver operating characteristic curves for logarithm posterior probabilities of CXRs (0.868 for pneumonia-like opacities) and BCTs (0.904 for infarction) were comparable to those in previous studies with other anomaly detection methods. This result showed the versatility of our method.
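One way to read the posterior-based normality metric described above is via Bayes' rule, using the log-likelihood from the model trained on normal images and the log-likelihood from the model trained on the unlabeled mixture. The sketch below is an illustration under that assumption, not the authors' implementation; `log_normality_posterior`, `prior_normal`, and `auroc` are hypothetical names, and the prior is an assumed input.

```python
import numpy as np

def log_normality_posterior(ll_normal, ll_mixed, prior_normal=0.5):
    """log p(normal | x) = log p(normal) + log p(x | normal) - log p(x).

    ll_normal: log-likelihood of x under the model trained on normal images.
    ll_mixed:  log-likelihood of x under the model trained on the unlabeled
               mixture, used here as an estimate of log p(x).
    prior_normal: assumed fraction of normal images in the mixture.
    """
    return (np.log(prior_normal)
            + np.asarray(ll_normal, dtype=float)
            - np.asarray(ll_mixed, dtype=float))

def auroc(scores_normal, scores_abnormal):
    """Probability that a normal image scores higher than an abnormal one
    (ties count as one half)."""
    pos = np.asarray(scores_normal, dtype=float)[:, None]
    neg = np.asarray(scores_abnormal, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())
```

Because the two likelihoods come from separately trained models, the estimated log posterior can exceed zero in practice; only its ranking across images matters when it is summarized with an AUROC.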
Purpose: To develop a Breast Imaging Reporting and Data System (BI-RADS) breast density deep learning (DL) model in a multi-site setting for synthetic two-dimensional mammography (SM) images derived from digital breast tomosynthesis exams using full-field digital mammography (FFDM) images and limited SM data. Materials and Methods: A DL model was trained to predict BI-RADS breast density using FFDM images acquired from 2008 to 2017 (Site 1: 57492 patients, 187627 exams, 750752 images) for this retrospective study. The FFDM model was evaluated using SM datasets from two institutions (Site 1: 3842 patients, 3866 exams, 14472 images, acquired from 2016 to 2017; Site 2: 7557 patients, 16283 exams, 63973 images, 2015 to 2019). Each of the three datasets was then split into training, validation, and test datasets. Adaptation methods were investigated to improve performance on the SM datasets, and the effect of dataset size on each adaptation method was considered. Statistical significance was assessed using confidence intervals (CI), estimated by bootstrapping. Results: Without adaptation, the model demonstrated substantial agreement with the original reporting radiologists for all three datasets (Site 1 FFDM: linearly-weighted $\kappa_w$ = 0.75 [95% CI: 0.74, 0.76]; Site 1 SM: $\kappa_w$ = 0.71 [95% CI: 0.64, 0.78]; Site 2 SM: $\kappa_w$ = 0.72 [95% CI: 0.70, 0.75]). With adaptation, performance improved for Site 2 (Site 1: $\kappa_w$ = 0.72 [95% CI: 0.66, 0.79], 0.71 vs 0.72, P = .80; Site 2: $\kappa_w$ = 0.79 [95% CI: 0.76, 0.81], 0.72 vs 0.79, P $<$ .001) using only 500 SM images from that site. Conclusion: A BI-RADS breast density DL model demonstrated strong performance on FFDM and SM images from two institutions without training on SM images and improved using few SM images.
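The agreement statistic and its interval can be sketched as follows. This is a minimal illustration of linearly weighted Cohen's kappa with a percentile bootstrap, assuming integer class labels `0..n_classes-1` and resampling at the level of individual predictions; the study's exact resampling unit and bootstrap settings are not stated here, and the function names are hypothetical.

```python
import numpy as np

def linear_weighted_kappa(a, b, n_classes):
    """Linearly weighted Cohen's kappa between two integer label arrays."""
    a = np.asarray(a)
    b = np.asarray(b)
    observed = np.zeros((n_classes, n_classes))
    np.add.at(observed, (a, b), 1)          # joint label counts
    observed /= observed.sum()
    # Agreement expected by chance from the two marginal distributions.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    idx = np.arange(n_classes)
    # Disagreement weights grow linearly with the class distance.
    weights = np.abs(idx[:, None] - idx[None, :]) / (n_classes - 1)
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

def bootstrap_ci(a, b, n_classes, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the weighted kappa."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a)
    b = np.asarray(b)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(a), size=len(a))  # resample with replacement
        stats.append(linear_weighted_kappa(a[idx], b[idx], n_classes))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)
```

For the four BI-RADS density categories, `n_classes` would be 4, with `a` holding the radiologists' labels and `b` the model's predictions.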
Classification of malignancy for breast cancer and other cancer types is usually tackled as an object detection problem: individual lesions are first localized and then classified with respect to malignancy. The drawback of this approach is that it disregards abstract features that incorporate several lesions, as well as areas that are not labelled as a lesion but contain global, medically relevant information. Especially for dynamic contrast-enhanced breast MRI, criteria such as background parenchymal enhancement and location within the breast are important for diagnosis and cannot be captured properly by object detection approaches. In this work, we propose a 3D CNN and a multi-scale curriculum learning strategy to classify malignancy globally based on an MRI of the whole breast. Thus, the global context of the whole breast rather than individual lesions is taken into account. Our proposed approach does not rely on lesion segmentations, which renders the annotation of training data much more efficient than in current object detection approaches. Our approach achieves an AUROC of 0.89, and we compare its performance to Mask R-CNN and Retina U-Net as well as a radiologist. Our performance is on par with approaches that, in contrast to our method, rely on pixelwise segmentations of lesions.