Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

SWAGAN: A Style-based Wavelet-driven Generative Model

93 0 0.0 ( 0 )

Download Cite

Added by Rinon Gal

Publication date 2021

fields Informatics Engineering Electronic Engineering

and research's language is English

Authors Rinon Gal - Dana Cohen - Amit Bermano

Computer Vision and Pattern Recognition Image and Video Processing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture, and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach yields enhancements in the visual quality of the generated images, and considerably increases computational performance. We demonstrate the advantage of our method by integrating it into the SyleGAN2 framework, and verifying that content generation in the wavelet domain leads to higher quality images with more realistic high-frequency content. Furthermore, we verify that our models latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved downstream visual quality.

rate research

DRB-GAN: A Dynamic ResBlock Generative Adversarial Network for Artistic Style Transfer

223 - Wenju Xu , Chengjiang Long , Ruisheng Wang 2021

The paper proposes a Dynamic ResBlock Generative Adversarial Network (DRB-GAN) for artistic style transfer. The style code is modeled as the shared parameters for Dynamic ResBlocks connecting both the style encoding network and the style transfer network. In the style encoding network, a style class-aware attention mechanism is used to attend the style feature representation for generating the style codes. In the style transfer network, multiple Dynamic ResBlocks are designed to integrate the style code and the extracted CNN semantic feature and then feed into the spatial window Layer-Instance Normalization (SW-LIN) decoder, which enables high-quality synthetic images with artistic style transfer. Moreover, the style collection conditional discriminator is designed to equip our DRB-GAN model with abilities for both arbitrary style transfer and collection style transfer during the training stage. No matter for arbitrary style transfer or collection style transfer, extensive experiments strongly demonstrate that our proposed DRB-GAN outperforms state-of-the-art methods and exhibits its superior performance in terms of visual quality and efficiency. Our source code is available at color{magenta}{url{https://github.com/xuwenju123/DRB-GAN}}.

Computer Vision and Pattern Recognition Image and Video Processing

WaveFill: A Wavelet-based Generation Network for Image Inpainting

414 - Yingchen Yu , Fangneng Zhan , Shijian Lu 2021

Image inpainting aims to complete the missing or corrupted regions of images with realistic contents. The prevalent approaches adopt a hybrid objective of reconstruction and perceptual quality by using generative adversarial networks. However, the reconstruction loss and adversarial loss focus on synthesizing contents of different frequencies and simply applying them together often leads to inter-frequency conflicts and compromised inpainting. This paper presents WaveFill, a wavelet-based inpainting network that decomposes images into multiple frequency bands and fills the missing regions in each frequency band separately and explicitly. WaveFill decomposes images by using discrete wavelet transform (DWT) that preserves spatial information naturally. It applies L1 reconstruction loss to the decomposed low-frequency bands and adversarial loss to high-frequency bands, hence effectively mitigate inter-frequency conflicts while completing images in spatial domain. To address the inpainting inconsistency in different frequency bands and fuse features with distinct statistics, we design a novel normalization scheme that aligns and fuses the multi-frequency features effectively. Extensive experiments over multiple datasets show that WaveFill achieves superior image inpainting qualitatively and quantitatively.

Computer Vision and Pattern Recognition Image and Video Processing

Remote Sensing Image Translation via Style-Based Recalibration Module and Improved Style Discriminator

96 - Tiange Zhang , Feng Gao , Junyu Dong 2021

Existing remote sensing change detection methods are heavily affected by seasonal variation. Since vegetation colors are different between winter and summer, such variations are inclined to be falsely detected as changes. In this letter, we proposed an image translation method to solve the problem. A style-based recalibration module is introduced to capture seasonal features effectively. Then, a new style discriminator is designed to improve the translation performance. The discriminator can not only produce a decision for the fake or real sample, but also return a style vector according to the channel-wise correlations. Extensive experiments are conducted on season-varying dataset. The experimental results show that the proposed method can effectively perform image translation, thereby consistently improving the season-varying image change detection performance. Our codes and data are available at https://github.com/summitgao/RSIT_SRM_ISD.

Computer Vision and Pattern Recognition Image and Video Processing

Generative-Adversarial-Networks-based Ghost Recognition

122 - Yuchen He , Yibing Chen , Sheng Luo 2021

Nowadays, target recognition technique plays an important role in many fields. However, the current target image information based methods suffer from the influence of image quality and the time cost of image reconstruction. In this paper, we propose a novel imaging-free target recognition method combining ghost imaging (GI) and generative adversarial networks (GAN). Based on the mechanism of GI, a set of random speckles sequence is employed to illuminate target, and a bucket detector without resolution is utilized to receive echo signal. The bucket signal sequence formed after continuous detections is constructed into a bucket signal array, which is regarded as the sample of GAN. Then, conditional GAN is used to map bucket signal array and target category. In practical application, the speckles sequence in training step is employed to illuminate target, and the bucket signal array is input GAN for recognition. The proposed method can improve the problems caused by conventional recognition methods that based on target image information, and provide a certain turbulence-free ability. Extensive experiments show that the proposed method achieves promising performance.

Computer Vision and Pattern Recognition Image and Video Processing

A Structurally Regularized Convolutional Neural Network for Image Classification using Wavelet-based SubBand Decomposition

327 - Pavel Sinha , Ioannis Psaromiligkos , Zeljko Zilic 2021

We propose a convolutional neural network (CNN) architecture for image classification based on subband decomposition of the image using wavelets. The proposed architecture decomposes the input image spectra into multiple critically sampled subbands, extracts features using a single CNN per subband, and finally, performs classification by combining the extracted features using a fully connected layer. Processing each of the subbands by an individual CNN, thereby limiting the learning scope of each CNN to a single subband, imposes a form of structural regularization. This provides better generalization capability as seen by the presented results. The proposed architecture achieves best-in-class performance in terms of total multiply-add-accumulator operations and nearly best-in-class performance in terms of total parameters required, yet it maintains competitive classification performance. We also show the proposed architecture is more robust than the regular full-band CNN to noise caused by weight-and-bias quantization and input quantization.

Computer Vision and Pattern Recognition Image and Video Processing

comments

Fetching comments

Hama University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

SWAGAN: A Style-based Wavelet-driven Generative Model

Ask ChatGPT about the research

No Arabic abstract

Read More