DW-GAN: A Discrete Wavelet Transform GAN for NonHomogeneous Dehazing

69 0 0.0 ( 0 )

Download Cite

Added by Huan Liu

Publication date 2021

fields Electronic Engineering

and research's language is English

Authors Minghan Fu - Huan Liu - Yankun Yu

Image and Video Processing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Hazy images are often subject to color distortion, blurring, and other visible quality degradation. Some existing CNN-based methods have great performance on removing homogeneous haze, but they are not robust in non-homogeneous case. The reasons are mainly in two folds. Firstly, due to the complicated haze distribution, texture details are easy to be lost during the dehazing process. Secondly, since the training pairs are hard to be collected, training on limited data can easily lead to over-fitting problem. To tackle these two issues, we introduce a novel dehazing network using 2D discrete wavelet transform, namely DW-GAN. Specifically, we propose a two-branch network to deal with the aforementioned problems. By utilizing wavelet transform in DWT branch, our proposed method can retain more high-frequency knowledge in feature maps. In order to prevent over-fitting, ImageNet pre-trained Res2Net is adopted in the knowledge adaptation branch. Owing to the robust feature representations of ImageNet pre-training, the generalization ability of our network is improved dramatically. Finally, a patch-based discriminator is used to reduce artifacts of the restored images. Extensive experimental results demonstrate that the proposed method outperforms the state-of-the-arts quantitatively and qualitatively.

rate research

TWIST-GAN: Towards Wavelet Transform and Transferred GAN for Spatio-Temporal Single Image Super Resolution

145 - Fayaz Ali Dharejo , Farah Deeba , Yuanchun Zhou 2021

Single Image Super-resolution (SISR) produces high-resolution images with fine spatial resolutions from aremotely sensed image with low spatial resolution. Recently, deep learning and generative adversarial networks(GANs) have made breakthroughs for the challenging task of single image super-resolution (SISR). However, thegenerated image still suffers from undesirable artifacts such as, the absence of texture-feature representationand high-frequency information. We propose a frequency domain-based spatio-temporal remote sensingsingle image super-resolution technique to reconstruct the HR image combined with generative adversarialnetworks (GANs) on various frequency bands (TWIST-GAN). We have introduced a new method incorporatingWavelet Transform (WT) characteristics and transferred generative adversarial network. The LR image hasbeen split into various frequency bands by using the WT, whereas, the transfer generative adversarial networkpredicts high-frequency components via a proposed architecture. Finally, the inverse transfer of waveletsproduces a reconstructed image with super-resolution. The model is first trained on an external DIV2 Kdataset and validated with the UC Merceed Landsat remote sensing dataset and Set14 with each image sizeof 256x256. Following that, transferred GANs are used to process spatio-temporal remote sensing images inorder to minimize computation cost differences and improve texture information. The findings are comparedqualitatively and qualitatively with the current state-of-art approaches. In addition, we saved about 43% of theGPU memory during training and accelerated the execution of our simplified version by eliminating batchnormalization layers.

Image and Video Processing Computer Vision and Pattern Recognition

UWGAN: Underwater GAN for Real-world Underwater Color Restoration and Dehazing

62 - Nan Wang , Yabin Zhou , Fenglei Han 2019

In real-world underwater environment, exploration of seabed resources, underwater archaeology, and underwater fishing rely on a variety of sensors, vision sensor is the most important one due to its high information content, non-intrusive, and passive nature. However, wavelength-dependent light attenuation and back-scattering result in color distortion and haze effect, which degrade the visibility of images. To address this problem, firstly, we proposed an unsupervised generative adversarial network (GAN) for generating realistic underwater images (color distortion and haze effect) from in-air image and depth map pairs based on improved underwater imaging model. Secondly, U-Net, which is trained efficiently using synthetic underwater dataset, is adopted for color restoration and dehazing. Our model directly reconstructs underwater clear images using end-to-end autoencoder networks, while maintaining scene content structural similarity. The results obtained by our method were compared with existing methods qualitatively and quantitatively. Experimental results obtained by the proposed model demonstrate well performance on open real-world underwater datasets, and the processing speed can reach up to 125FPS running on one NVIDIA 1060 GPU. Source code, sample datasets are made publicly available at https://github.com/infrontofme/UWGAN_UIE.

Image and Video Processing Computer Vision and Pattern Recognition

Efficient Re-parameterization Residual Attention Network For Nonhomogeneous Image Dehazing

170 - Tian Ye , ErKang Chen , XinRui Huang 2021

This paper proposes an end-to-end Efficient Re-parameterizationResidual Attention Network(ERRA-Net) to directly restore the nonhomogeneous hazy image. The contribution of this paper mainly has the following three aspects: 1) A novel Multi-branch Attention (MA) block. The spatial attention mechanism better reconstructs high-frequency features, and the channel attention mechanism treats the features of different channels differently. Multi-branch structure dramatically improves the representation ability of the model and can be changed into a single path structure after re-parameterization to speed up the process of inference. Local Residual Connection allows the low-frequency information in the nonhomogeneous area to pass through the block without processing so that the block can focus on detailed features. 2) A lightweight network structure. We use cascaded MA blocks to extract high-frequency features step by step, and the Multi-layer attention fusion tail combines the shallow and deep features of the model to get the residual of the clean image finally. 3)We propose two novel loss functions to help reconstruct the hazy image ColorAttenuation loss and Laplace Pyramid loss. ERRA-Net has an impressive speed, processing 1200x1600 HD quality images with an average runtime of 166.11 fps. Extensive evaluations demonstrate that ERSANet performs favorably against the SOTA approaches on the real-world hazy images.

Image and Video Processing Computer Vision and Pattern Recognition

TR-GAN: Topology Ranking GAN with Triplet Loss for Retinal Artery/Vein Classification

314 - Wenting Chen , Shuang Yu , Junde Wu 2020

Retinal artery/vein (A/V) classification lays the foundation for the quantitative analysis of retinal vessels, which is associated with potential risks of various cardiovascular and cerebral diseases. The topological connection relationship, which has been proved effective in improving the A/V classification performance for the conventional graph based method, has not been exploited by the deep learning based method. In this paper, we propose a Topology Ranking Generative Adversarial Network (TR-GAN) to improve the topology connectivity of the segmented arteries and veins, and further to boost the A/V classification performance. A topology ranking discriminator based on ordinal regression is proposed to rank the topological connectivity level of the ground-truth, the generated A/V mask and the intentionally shuffled mask. The ranking loss is further back-propagated to the generator to generate better connected A/V masks. In addition, a topology preserving module with triplet loss is also proposed to extract the high-level topological features and further to narrow the feature distance between the predicted A/V mask and the ground-truth. The proposed framework effectively increases the topological connectivity of the predicted A/V masks and achieves state-of-the-art A/V classification performance on the publicly available AV-DRIVE dataset.

Image and Video Processing Computer Vision and Pattern Recognition

NTIRE 2020 Challenge on NonHomogeneous Dehazing

84 - Codruta O. Ancuti , Cosmin Ancuti , Florin-Alexandru Vasluianu 2020

This paper reviews the NTIRE 2020 Challenge on NonHomogeneous Dehazing of images (restoration of rich details in hazy image). We focus on the proposed solutions and their results evaluated on NH-Haze, a novel dataset consisting of 55 pairs of real haze free and nonhomogeneous hazy images recorded outdoor. NH-Haze is the first realistic nonhomogeneous haze dataset that provides ground truth images. The nonhomogeneous haze has been produced using a professional haze generator that imitates the real conditions of haze scenes. 168 participants registered in the challenge and 27 teams competed in the final testing phase. The proposed solutions gauge the state-of-the-art in image dehazing.

Computer Vision and Pattern Recognition