Thermal images capture the passive radiation emitted by objects as grayscale images. Such images have a very different data distribution from optical color images. We present a method that produces a grayscale thermo-optical fused mask from a thermal input. To the best of our knowledge, this is the first deep learning work on thermal-optical grayscale fusion. Our method is also unique in that the proposed deep network operates in the Discrete Wavelet Transform (DWT) domain rather than the gray-level domain. As part of this work, we also present a new and unique database for obtaining the region of interest in thermal images, built on an existing thermal-visual paired database and containing region-of-interest annotations for 5 different classes of data. Finally, we propose a simple, low-overhead statistical measure for identifying the region of interest in the fused images, which we call the Region of Fusion (RoF). Experiments on the database show encouraging results in identifying the region of interest in the fused images. We also show that the fused images can be processed better than thermal images alone.
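For readers unfamiliar with wavelet-domain processing, the sketch below illustrates what fusion in the DWT domain (rather than the gray-level domain) looks like. It stands in for the paper's learned network with a simple hand-crafted rule, purely for illustration; the wavelet choice and the fusion rule are assumptions, not the authors' method.

```python
# Minimal sketch of grayscale image fusion in the DWT domain using PyWavelets.
# The learned fusion network is replaced by a hand-crafted rule (average the
# approximation bands, keep the max-magnitude detail coefficients); all
# choices below are illustrative assumptions.
import numpy as np
import pywt

def dwt_fuse(thermal: np.ndarray, optical: np.ndarray, wavelet: str = "db2") -> np.ndarray:
    """Fuse two grayscale images of equal shape in the wavelet domain."""
    # Decompose each image into approximation (cA) and detail (cH, cV, cD) bands.
    cA_t, (cH_t, cV_t, cD_t) = pywt.dwt2(thermal.astype(np.float64), wavelet)
    cA_o, (cH_o, cV_o, cD_o) = pywt.dwt2(optical.astype(np.float64), wavelet)

    def pick(a, b):
        # Keep the detail coefficient with the larger magnitude, preserving
        # the sharper structure from either source image.
        return np.where(np.abs(a) >= np.abs(b), a, b)

    cA_f = 0.5 * (cA_t + cA_o)
    details = (pick(cH_t, cH_o), pick(cV_t, cV_o), pick(cD_t, cD_o))

    # Inverse transform back to the gray-level domain.
    fused = pywt.idwt2((cA_f, details), wavelet)
    return np.clip(fused, 0, 255).astype(np.uint8)
```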
Existing deep Thermal InfraRed (TIR) trackers use only semantic features to describe the TIR object, which lack sufficient discriminative capacity for handling distractors. This becomes worse when the feature extraction network is trained only on RGB images. To address this issue, we propose a multi-level similarity model under a Siamese framework for robust TIR object tracking. Specifically, we compute similarities of different patterns on two convolutional layers using the proposed multi-level similarity network. One focuses on the global semantic similarity, while the other computes the local structural similarity of the TIR object. These two similarities complement each other and hence enhance the discriminative capacity of the network for handling distractors. In addition, we design a simple yet effective relative-entropy-based ensemble subnetwork to integrate the semantic and structural similarities. This subnetwork adaptively learns the weights of the semantic and structural similarities during training. To further enhance the discriminative capacity of the tracker, we construct the first large-scale TIR video sequence dataset for training the proposed model. The proposed TIR dataset not only benefits training for TIR tracking but can also be applied to numerous other TIR vision tasks. Extensive experimental results on the VOT-TIR2015 and VOT-TIR2017 benchmarks demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
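As a rough illustration of the two-level similarity idea, the PyTorch sketch below cross-correlates template and search features at a shallow (structural) layer and a deep (semantic) layer, then blends the two response maps with learnable weights. The layer choices and the softmax blend standing in for the relative-entropy ensemble subnetwork are assumptions, not the paper's exact architecture.

```python
# Two-branch similarity sketch for Siamese tracking (batch size 1).
# Everything here is an illustrative assumption about the general approach.
import torch
import torch.nn.functional as F

def xcorr(template_feat: torch.Tensor, search_feat: torch.Tensor) -> torch.Tensor:
    """Sliding-window similarity map: conv2d with the template as the kernel."""
    return F.conv2d(search_feat, template_feat)

class MultiLevelSimilarity(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable ensemble weights; the paper learns these with a relative
        # entropy based subnetwork, approximated here by a softmax blend.
        self.logits = torch.nn.Parameter(torch.zeros(2))

    def forward(self, z_shallow, z_deep, x_shallow, x_deep):
        structural = xcorr(z_shallow, x_shallow)   # local structural similarity
        semantic = xcorr(z_deep, x_deep)           # global semantic similarity
        # Resize the coarser semantic map to match the structural map.
        semantic = F.interpolate(semantic, size=structural.shape[-2:])
        w = torch.softmax(self.logits, dim=0)
        return w[0] * structural + w[1] * semantic
```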
Chromosome classification is an important but difficult and tedious task in karyotyping. Previous methods classify only manually segmented single chromosomes, which is far from clinical practice. In this work, we propose a detection-based method, DeepACC, to locate and classify chromosomes at a fine-grained level simultaneously, based on the whole metaphase image. We first introduce the Additive Angular Margin Loss to enhance the discriminative power of the model. To alleviate batch effects, we transform the decision boundary of each class case by case through a Siamese network, which makes full use of the prior knowledge that chromosomes usually appear in pairs. Furthermore, we take the clinical seven-group criterion as prior knowledge and design an additional Group Inner-Adjacency Loss to further reduce inter-class similarities. 3,390 metaphase images from a clinical laboratory were collected and labelled to evaluate performance. Results show that the new design brings encouraging performance gains compared to state-of-the-art baselines.
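The Additive Angular Margin Loss referenced above is the ArcFace-style loss: an angular margin is added to the ground-truth class angle before the softmax, tightening within-class clusters on the unit hypersphere. A minimal PyTorch sketch follows; the scale and margin values are common defaults, not necessarily the paper's settings.

```python
# ArcFace-style Additive Angular Margin Loss (minimal sketch).
import torch
import torch.nn.functional as F

class AdditiveAngularMarginLoss(torch.nn.Module):
    def __init__(self, feat_dim: int, num_classes: int, s: float = 30.0, m: float = 0.5):
        super().__init__()
        self.s, self.m = s, m                      # scale and angular margin
        self.weight = torch.nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine of the angle between each normalized feature and class center.
        cosine = F.linear(F.normalize(features), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        # Add the margin m only to the ground-truth class angle.
        one_hot = F.one_hot(labels, cosine.size(1)).bool()
        logits = torch.where(one_hot, torch.cos(theta + self.m), cosine)
        return F.cross_entropy(self.s * logits, labels)
```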
Early wildfire detection is of paramount importance to minimize damage to the environment, property, and lives. Deep Learning (DL) models that can leverage both visible and infrared information have the potential to achieve state-of-the-art performance, with lower false-positive rates than existing techniques. However, most DL-based image fusion methods have not been evaluated in the domain of fire imagery, and to the best of our knowledge, no publicly available dataset contains visible-infrared fused fire images. Given the growing interest in DL-based image fusion techniques and their reduced complexity, we select three state-of-the-art DL-based image fusion techniques and evaluate them for the specific task of fire image fusion, comparing their performance on selected metrics. Finally, we present an extension to one of these methods, which we call FIRe-GAN, that improves the generation of artificial infrared images and fused images on those metrics.
In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma. When a whole slide image (WSI) is used as an input query, it is desirable to retrieve similar cases by focusing on image patches in pathologically important regions such as tumor cells. To address this problem, we employ attention-based multiple instance learning, which enables us to focus on tumor-specific regions when the similarity between cases is computed. Moreover, we employ contrastive distance metric learning to incorporate immunohistochemical (IHC) staining patterns as useful supervised information for defining an appropriate similarity between heterogeneous malignant lymphoma cases. In experiments with 249 malignant lymphoma patients, we confirmed that the proposed method achieved higher evaluation measures than the baseline case-based SIR methods. Furthermore, a subjective evaluation by pathologists revealed that our similarity measure using IHC staining patterns is appropriate for representing the similarity of H&E-stained tissue images for malignant lymphoma.
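The attention-based multiple instance learning used here follows the general recipe of attention pooling over patch features, so that informative (e.g., tumor-specific) patches dominate the case-level embedding on which similarity is computed. The sketch below shows one minimal form of it with placeholder dimensions; the authors' exact network is not specified in the abstract.

```python
# Attention-based MIL pooling (after Ilse et al., 2018), minimal sketch.
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    def __init__(self, feat_dim: int = 512, hidden_dim: int = 128):
        super().__init__()
        # Small gated network producing one attention score per patch.
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        """patch_feats: (num_patches, feat_dim) for one WSI (a 'bag')."""
        # Normalize attention scores across the bag.
        a = torch.softmax(self.attention(patch_feats), dim=0)   # (N, 1)
        # Case-level embedding = attention-weighted sum of patch features;
        # case-to-case similarity can then be computed on these embeddings.
        return (a * patch_feats).sum(dim=0)                     # (feat_dim,)
```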
Purpose: To develop a deep learning approach to denoise optical coherence tomography (OCT) B-scans of the optic nerve head (ONH). Methods: Volume scans consisting of 97 horizontal B-scans were acquired through the center of the ONH using a commercial OCT device (Spectralis) for both eyes of 20 subjects. For each eye, single-frame (without signal averaging) and multi-frame (75x signal averaging) volume scans were obtained. A custom deep learning network was then designed and trained with 2,328 clean B-scans (multi-frame B-scans) and their corresponding noisy B-scans (clean B-scans + Gaussian noise) to denoise the single-frame B-scans. The performance of the denoising algorithm was assessed qualitatively, and quantitatively on 1,552 B-scans using the signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), and mean structural similarity index (MSSIM). Results: The proposed algorithm successfully denoised unseen single-frame OCT B-scans. The denoised B-scans were qualitatively similar to their corresponding multi-frame B-scans, with enhanced visibility of the ONH tissues. The mean SNR increased from $4.02 \pm 0.68$ dB (single-frame) to $8.14 \pm 1.03$ dB (denoised). For all the ONH tissues, the mean CNR increased from $3.50 \pm 0.56$ (single-frame) to $7.63 \pm 1.81$ (denoised). The MSSIM increased from $0.13 \pm 0.02$ (single-frame) to $0.65 \pm 0.03$ (denoised) when compared with the corresponding multi-frame B-scans. Conclusions: Our deep learning algorithm can denoise a single-frame OCT B-scan of the ONH in under 20 ms, thus offering a framework to obtain superior-quality OCT B-scans with reduced scanning times and minimal patient discomfort.
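For concreteness, the sketch below shows one way the reported metrics can be computed. The abstract does not define the signal and background regions, so the masks here are placeholders, and MSSIM is computed with scikit-image; these are assumptions about the evaluation, not the authors' exact protocol.

```python
# Image-quality metrics for denoised vs. reference B-scans (sketch).
import numpy as np
from skimage.metrics import structural_similarity

def snr_db(img: np.ndarray, signal_mask: np.ndarray, noise_mask: np.ndarray) -> float:
    """SNR in dB: mean of a signal ROI over the std of a background ROI."""
    return 20.0 * np.log10(img[signal_mask].mean() / img[noise_mask].std())

def cnr(img: np.ndarray, tissue_mask: np.ndarray, background_mask: np.ndarray) -> float:
    """Contrast-to-noise ratio between a tissue ROI and the background."""
    return abs(img[tissue_mask].mean() - img[background_mask].mean()) / img[background_mask].std()

def mssim(denoised: np.ndarray, multi_frame: np.ndarray) -> float:
    """Mean SSIM of a denoised B-scan against its multi-frame reference."""
    return structural_similarity(
        denoised, multi_frame, data_range=multi_frame.max() - multi_frame.min()
    )
```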