
Multi-Scale Recursive and Perception-Distortion Controllable Image Super-Resolution

Published by Pablo Navarrete Michelini
Publication date: 2018
Research language: English





We describe our solution for the PIRM Super-Resolution Challenge 2018, where we achieved the 2nd best perceptual quality for average RMSE<=16, 5th best for RMSE<=12.5, and 7th best for RMSE<=11.5. We modify a recently proposed Multi-Grid Back-Projection (MGBP) architecture to work as a generative system with an input parameter that controls the amount of artificial detail in the output. We propose a discriminator for adversarial training with the following novel properties: it is multi-scale, resembling a progressive GAN; it is recursive, balancing the architecture of the generator; and it includes a new layer to capture significant statistics of natural images. Finally, we propose a training strategy that avoids conflicts between reconstruction and perceptual losses. Our configuration uses only 281k parameters and upscales each image of the competition in 0.2s on average.
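The abstract describes a generator whose perception-distortion trade-off is steered by a single input parameter. The following minimal sketch illustrates that idea only; it is not the MGBP architecture, and the module names, the noise branch, and the way the scalar `alpha` modulates injected noise are assumptions for illustration.

```python
# Illustrative sketch: a toy upscaler whose amount of synthesized detail is
# steered by a scalar control input, in the spirit of perception-distortion
# controllable SR. Not the authors' MGBP generator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ControllableUpscaler(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.noise_proj = nn.Conv2d(1, channels, 1)   # hypothetical noise branch
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, lr, alpha):
        # alpha in [0, 1]: 0 -> distortion-oriented output, 1 -> perception-oriented output
        x = F.interpolate(lr, scale_factor=2, mode="bicubic", align_corners=False)
        feat = self.head(x)
        noise = torch.randn(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        feat = feat + alpha * self.noise_proj(noise)   # control amount of synthetic detail
        return x + self.tail(self.body(feat))          # residual prediction of the HR image

lr = torch.rand(1, 3, 64, 64)
model = ControllableUpscaler()
sr_sharp = model(lr, alpha=1.0)   # more artificial detail
sr_safe = model(lr, alpha=0.0)    # closer to the RMSE-optimal output
```

At test time the same trained weights serve the whole perception-distortion curve: sweeping `alpha` trades reconstruction fidelity against perceptual sharpness without retraining.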


Read also

Recently, single image super-resolution (SISR) approaches with deep and complex convolutional neural network structures have achieved promising performance. However, those methods improve performance at the cost of higher memory consumption, which makes them difficult to apply on mobile devices with limited storage and computing resources. To solve this problem, we present a lightweight multi-scale feature interaction network (MSFIN). For lightweight SISR, MSFIN expands the receptive field and adequately exploits the informative features of the low-resolution observed images across various scales and interactive connections. In addition, we design a lightweight recurrent residual channel attention block (RRCAB) so that the network can benefit from the channel attention mechanism while remaining sufficiently lightweight. Extensive experiments on several benchmarks confirm that our proposed MSFIN achieves performance comparable to the state of the art with a more lightweight model.
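As a rough illustration of the kind of block the abstract names, here is a hedged sketch of a residual channel-attention unit; the actual RRCAB additionally reuses weights recurrently, and the layer sizes below are assumptions.

```python
# Illustrative residual channel-attention block, loosely in the spirit of RRCAB:
# squeeze-and-excitation style channel weighting inside a residual unit.
import torch
import torch.nn as nn

class ResidualChannelAttentionBlock(nn.Module):
    def __init__(self, channels=32, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Channel attention: global average pool -> bottleneck MLP -> sigmoid gate
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        res = self.conv(x)
        return x + res * self.attention(res)   # re-scale channels, keep the skip path

x = torch.rand(1, 32, 48, 48)
print(ResidualChannelAttentionBlock()(x).shape)   # torch.Size([1, 32, 48, 48])
```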
We introduce a simple and efficient lossless image compression algorithm. We store a low resolution version of an image as raw pixels, followed by several iterations of lossless super-resolution. For lossless super-resolution, we predict the probability of a high-resolution image, conditioned on the low-resolution input, and use entropy coding to compress this super-resolution operator. Super-Resolution based Compression (SReC) is able to achieve state-of-the-art compression rates with practical runtimes on large datasets. Code is available online at https://github.com/caoscott/SReC.
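To make the coding idea concrete, the sketch below computes the ideal bitstream size of a high-resolution image under a network's predicted per-pixel distribution conditioned on the low-resolution image; an entropy coder would spend roughly -log2 p(pixel) bits per symbol. This is a simplified stand-in, not the SReC implementation, and the `PixelPredictor` network is a hypothetical placeholder.

```python
# Conceptual sketch of the coding cost in SR-based lossless compression:
# a model predicts a 256-way distribution for each HR pixel given the LR image,
# and the entropy-coded size approaches the cross-entropy in bits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelPredictor(nn.Module):
    """Toy network: 256-way logits for every HR pixel from the upsampled LR image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3 * 256, 3, padding=1),   # 256 intensity classes per color channel
        )

    def forward(self, lr, hr_shape):
        up = F.interpolate(lr, size=hr_shape, mode="bilinear", align_corners=False)
        logits = self.net(up)
        b, _, h, w = logits.shape
        return logits.view(b, 3, 256, h, w)

def code_length_bits(logits, hr_pixels):
    # Cross-entropy in bits = the (ideal) size of the entropy-coded bitstream.
    log_probs = F.log_softmax(logits, dim=2)
    picked = log_probs.gather(2, hr_pixels.long().unsqueeze(2)).squeeze(2)
    return -picked.sum() / torch.log(torch.tensor(2.0))

lr = torch.rand(1, 3, 32, 32)
hr = torch.randint(0, 256, (1, 3, 64, 64))
bits = code_length_bits(PixelPredictor()(lr, (64, 64)), hr)
print(f"{bits.item() / (3 * 64 * 64):.2f} bits per subpixel")
```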
We present SR3, an approach to image Super-Resolution via Repeated Refinement. SR3 adapts denoising diffusion probabilistic models to conditional image generation and performs super-resolution through a stochastic denoising process. Inference starts with pure Gaussian noise and iteratively refines the noisy output using a U-Net model trained on denoising at various noise levels. SR3 exhibits strong performance on super-resolution tasks at different magnification factors, on faces and natural images. We conduct human evaluation on a standard 8X face super-resolution task on CelebA-HQ, comparing with SOTA GAN methods. SR3 achieves a fool rate close to 50%, suggesting photo-realistic outputs, while GANs do not exceed a fool rate of 34%. We further show the effectiveness of SR3 in cascaded image generation, where generative models are chained with super-resolution models, yielding a competitive FID score of 11.3 on ImageNet.
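The following toy loop illustrates the sampling procedure described above: inference starts from pure Gaussian noise and repeatedly applies a denoiser conditioned on the upsampled low-resolution image. The tiny network, the schedule, and the update rule are simplified assumptions, not SR3's U-Net or its exact diffusion schedule.

```python
# Toy sketch of diffusion-style super-resolution sampling (not SR3's exact scheme).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Stand-in for a conditional U-Net: predicts noise from (noisy HR, upsampled LR)."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, noisy_hr, lr_up):
        return self.net(torch.cat([noisy_hr, lr_up], dim=1))

@torch.no_grad()
def sample(denoiser, lr, scale=8, steps=50):
    h, w = lr.shape[-2] * scale, lr.shape[-1] * scale
    lr_up = F.interpolate(lr, size=(h, w), mode="bicubic", align_corners=False)
    x = torch.randn(lr.size(0), 3, h, w)                 # start from pure Gaussian noise
    for t in range(steps, 0, -1):
        noise_hat = denoiser(x, lr_up)
        x = x - (1.0 / steps) * noise_hat                 # simplified refinement step
        if t > 1:
            x = x + (0.1 / steps) * torch.randn_like(x)   # small stochastic perturbation
    return x

sr = sample(TinyDenoiser(), torch.rand(1, 3, 16, 16))
print(sr.shape)   # torch.Size([1, 3, 128, 128]) for the 8x setting
```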
Residual and dense neural networks, which have greatly promoted the development of image super-resolution (SR), have produced many impressive results. However, based on our observations, although more layers and connections generally improve performance, the resulting growth in model parameters hinders the practical deployment of SR algorithms. Furthermore, algorithms supervised by L1/L2 losses can achieve considerable performance on traditional metrics such as PSNR and SSIM, yet they produce blurry and over-smoothed outputs without sufficient high-frequency details, i.e., a low perceptual index (PI). To address these issues, this paper develops a perception-oriented single image SR algorithm via dual relativistic average generative adversarial networks. In the generator, a novel residual channel attention block is proposed to recalibrate the significance of specific channels, further increasing feature expression capability. Parameters of the convolutional layers within each block are shared to expand receptive fields while keeping the number of tunable parameters unchanged. The feature maps are upsampled using sub-pixel convolution to obtain reconstructed high-resolution images. The discriminator consists of two relativistic average discriminators that operate in the pixel domain and the feature domain, respectively, fully exploiting the prior that half of the data in a mini-batch are fake. Different weighted combinations of perceptual loss and adversarial loss are used to supervise the generator and balance perceptual quality against objective results. Experimental results and ablation studies show that our proposed algorithm rivals state-of-the-art SR algorithms, both perceptually (PI minimization) and objectively (PSNR maximization), with fewer parameters.
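For reference, a minimal sketch of the relativistic average adversarial loss that this line of work builds on is given below; it only shows the loss computation on discriminator logits, omits the two pixel/feature-domain discriminators, and the function names are illustrative.

```python
# Minimal sketch of the relativistic average (RaGAN) losses: the discriminator
# estimates whether a real sample is more realistic than the *average* fake in
# the mini-batch, and the generator optimizes the symmetric objective.
import torch
import torch.nn.functional as F

def relativistic_average_d_loss(d_real, d_fake):
    # d_real, d_fake: raw discriminator logits for real and generated batches
    real_vs_fake = d_real - d_fake.mean()
    fake_vs_real = d_fake - d_real.mean()
    return (F.binary_cross_entropy_with_logits(real_vs_fake, torch.ones_like(real_vs_fake)) +
            F.binary_cross_entropy_with_logits(fake_vs_real, torch.zeros_like(fake_vs_real)))

def relativistic_average_g_loss(d_real, d_fake):
    # Generator: fakes should look more real than the average real, and vice versa.
    real_vs_fake = d_real - d_fake.mean()
    fake_vs_real = d_fake - d_real.mean()
    return (F.binary_cross_entropy_with_logits(fake_vs_real, torch.ones_like(fake_vs_real)) +
            F.binary_cross_entropy_with_logits(real_vs_fake, torch.zeros_like(real_vs_fake)))

d_real, d_fake = torch.randn(8, 1), torch.randn(8, 1)
print(relativistic_average_d_loss(d_real, d_fake).item(),
      relativistic_average_g_loss(d_real, d_fake).item())
```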
Deep convolutional neural networks (CNNs) have drawn great attention in image super-resolution (SR). Recently, the visual attention mechanism, which exploits both feature importance and contextual cues, has been introduced to image SR and has proven effective for improving CNN-based SR performance. In this paper, we make a thorough investigation of the attention mechanisms in an SR model and shed light on how simple and effective improvements on these ideas advance the state of the art. We further propose a unified approach called multi-grained attention networks (MGAN), which fully exploits the advantages of multi-scale and attention mechanisms in SR tasks. In our method, the importance of each neuron is computed according to its surrounding regions in a multi-grained fashion and is then used to adaptively re-scale the feature responses. More importantly, the channel attention and spatial attention strategies in previous methods can be essentially considered two special cases of our method. We also introduce multi-scale dense connections to extract image features at multiple scales and capture the features of different layers through dense skip connections. Ablation studies on benchmark datasets demonstrate the effectiveness of our method. In comparison with other state-of-the-art SR methods, our method shows superiority in terms of both accuracy and model size.
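The sketch below illustrates, in a hedged way, how attention computed over surrounding regions of a chosen grain size interpolates between the two familiar special cases: a grain covering the whole image behaves like channel attention, while a grain of one pixel behaves like per-pixel (spatial) gating. The module and its sizes are illustrative assumptions, not the MGAN architecture.

```python
# Toy sketch of region-based ("multi-grained") attention: each neuron's weight
# comes from average pooling over a neighborhood of a chosen grain size.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GrainedAttention(nn.Module):
    def __init__(self, channels=32, grain=4):
        super().__init__()
        self.grain = grain                       # side length of the pooled region
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        # Summarize each grain-sized region, map it to a gate, and resize back.
        pooled = F.avg_pool2d(x, self.grain)
        weights = F.interpolate(self.gate(pooled), size=x.shape[-2:], mode="nearest")
        return x * weights

x = torch.rand(1, 32, 48, 48)
print(GrainedAttention(grain=4)(x).shape)        # torch.Size([1, 32, 48, 48])
```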
