No Arabic abstract
In this work, we aim to learn an unpaired image enhancement model, which can enrich low-quality images with the characteristics of high-quality images provided by users. We propose a quality attention generative adversarial network (QAGAN) trained on unpaired data based on the bidirectional Generative Adversarial Network (GAN) embedded with a quality attention module (QAM). The key novelty of the proposed QAGAN lies in the injected QAM for the generator such that it learns domain-relevant quality attention directly from the two domains. More specifically, the proposed QAM allows the generator to effectively select semantic-related characteristics from the spatial-wise and adaptively incorporate style-related attributes from the channel-wise, respectively. Therefore, in our proposed QAGAN, not only discriminators but also the generator can directly access both domains which significantly facilitates the generator to learn the mapping function. Extensive experimental results show that, compared with the state-of-the-art methods based on unpaired learning, our proposed method achieves better performance in both objective and subjective evaluations.
The paper proposes a method to effectively fuse multi-exposure inputs and generates high-quality high dynamic range (HDR) images with unpaired datasets. Deep learning-based HDR image generation methods rely heavily on paired datasets. The ground truth provides information for the network getting HDR images without ghosting. Datasets without ground truth are hard to apply to train deep neural networks. Recently, Generative Adversarial Networks (GAN) have demonstrated their potentials of translating images from source domain X to target domain Y in the absence of paired examples. In this paper, we propose a GAN-based network for solving such problems while generating enjoyable HDR results, named UPHDR-GAN. The proposed method relaxes the constraint of paired dataset and learns the mapping from LDR domain to HDR domain. Although the pair data are missing, UPHDR-GAN can properly handle the ghosting artifacts caused by moving objects or misalignments with the help of modified GAN loss, improved discriminator network and useful initialization phase. The proposed method preserves the details of important regions and improves the total image perceptual quality. Qualitative and quantitative comparisons against other methods demonstrated the superiority of our method.
During training phase, more connections (e.g. channel concatenation in last layer of DenseNet) means more occupied GPU memory and lower GPU utilization, requiring more training time. The increase of training time is also not conducive to launch application of SR algorithms. Thiss why we abandoned DenseNet as basic network. Futhermore, we abandoned this paper due to its limitation only applied on medical images. Please view our lastest work applied on general images at arXiv:1911.03464.
The captured images under low light conditions often suffer insufficient brightness and notorious noise. Hence, low-light image enhancement is a key challenging task in computer vision. A variety of methods have been proposed for this task, but these methods often failed in an extreme low-light environment and amplified the underlying noise in the input image. To address such a difficult problem, this paper presents a novel attention-based neural network to generate high-quality enhanced low-light images from the raw sensor data. Specifically, we first employ attention strategy (i.e. channel attention and spatial attention modules) to suppress undesired chromatic aberration and noise. The channel attention module guides the network to refine redundant colour features. The spatial attention module focuses on denoising by taking advantage of the non-local correlation in the image. Furthermore, we propose a new pooling layer, called inverted shuffle layer, which adaptively selects useful information from previous features. Extensive experiments demonstrate the superiority of the proposed network in terms of suppressing the chromatic aberration and noise artifacts in enhancement, especially when the low-light image has severe noise.
Magnetic resonance imaging (MRI) is an important medical imaging modality, but its acquisition speed is quite slow due to the physiological limitations. Recently, super-resolution methods have shown excellent performance in accelerating MRI. In some circumstances, it is difficult to obtain high-resolution images even with prolonged scan time. Therefore, we proposed a novel super-resolution method that uses a generative adversarial network (GAN) with cyclic loss and attention mechanism to generate high-resolution MR images from low-resolution MR images by a factor of 2. We implemented our model on pelvic images from healthy subjects as training and validation data, while those data from patients were used for testing. The MR dataset was obtained using different imaging sequences, including T2, T2W SPAIR, and mDIXON-W. Four methods, i.e., BICUBIC, SRCNN, SRGAN, and EDSR were used for comparison. Structural similarity, peak signal to noise ratio, root mean square error, and variance inflation factor were used as calculation indicators to evaluate the performances of the proposed method. Various experimental results showed that our method can better restore the details of the high-resolution MR image as compared to the other methods. In addition, the reconstructed high-resolution MR image can provide better lesion textures in the tumor patients, which is promising to be used in clinical diagnosis.
Intelligent vision is appealing in computer-assisted and robotic surgeries. Vision-based analysis with deep learning usually requires large labeled datasets, but manual data labeling is expensive and time-consuming in medical problems. We investigate a novel cross-domain strategy to reduce the need for manual data labeling by proposing an image-to-image translation model live-cadaver GAN (LC-GAN) based on generative adversarial networks (GANs). We consider a situation when a labeled cadaveric surgery dataset is available while the task is instrument segmentation on an unlabeled live surgery dataset. We train LC-GAN to learn the mappings between the cadaveric and live images. For live image segmentation, we first translate the live images to fake-cadaveric images with LC-GAN and then perform segmentation on the fake-cadaveric images with models trained on the real cadaveric dataset. The proposed method fully makes use of the labeled cadaveric dataset for live image segmentation without the need to label the live dataset. LC-GAN has two generators with different architectures that leverage the deep feature representation learned from the cadaveric image based segmentation task. Moreover, we propose the structural similarity loss and segmentation consistency loss to improve the semantic consistency during translation. Our model achieves better image-to-image translation and leads to improved segmentation performance in the proposed cross-domain segmentation task.