Recently, using intermediate feature maps of pre-trained convolutional neural networks in the loss function for training new networks has yielded significant improvements in perceptual quality. It is believed that these features encode perceptual quality better and provide more efficient representations of input images than other perceptual metrics such as SSIM and PSNR. However, no systematic study has determined the underlying reason. Without such an analysis, it is not possible to evaluate the performance of a particular set of features, or to improve perceptual quality further by carefully selecting a subset of features from a pre-trained CNN. This work shows that the ability of pre-trained deep CNN features to optimize perceptual quality is correlated with their success in capturing basic characteristics of human visual perception. In particular, we focus our analysis on fundamental aspects of human perception, such as contrast sensitivity and orientation selectivity. We introduce two new formulations that measure the frequency and orientation selectivity of the features learned by convolutional layers, and use them to evaluate deep features learned by widely used deep CNNs such as VGG-16. We demonstrate that the pre-trained CNN features that receive higher scores are better at predicting human quality judgments. Furthermore, we show that our method can be used to select deep features to form a new loss function, which improves image reconstruction quality for the well-known single-image super-resolution problem.
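The abstract does not state the two formulations, so the following is only an illustrative sketch of what probing a convolutional kernel for frequency and orientation selectivity can look like. It is not the authors' measure. It assumes a single 2D square kernel, probes it with sinusoidal gratings, and scores orientation selectivity as one minus the circular variance of the responses, an index borrowed from the vision-science literature. All function names (`grating`, `response`, `orientation_selectivity`, `preferred_frequency`, `gabor`) are hypothetical.

```python
import numpy as np


def grating(size, freq, theta, phase=0.0):
    """Sinusoidal grating; freq in cycles per pixel, theta in radians."""
    half = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size] - half
    return np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)) + phase)


def response(kernel, freq, theta):
    """Phase-invariant response: energy over a quadrature pair of gratings."""
    size = kernel.shape[0]
    even = np.sum(kernel * grating(size, freq, theta, 0.0))
    odd = np.sum(kernel * grating(size, freq, theta, np.pi / 2))
    return float(np.hypot(even, odd))


def orientation_selectivity(kernel, freq, n_orient=16):
    """1 - circular variance: near 0 = untuned, near 1 = sharply tuned (assumed index)."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    r = np.array([response(kernel, freq, t) for t in thetas])
    total = r.sum()
    if total == 0:
        return 0.0
    # Orientation is pi-periodic, so angles are doubled before averaging.
    return float(np.abs(np.sum(r * np.exp(2j * thetas))) / total)


def preferred_frequency(kernel, freqs, n_orient=16):
    """Probe frequency with the largest response over all tested orientations."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    best = [max(response(kernel, f, t) for t in thetas) for f in freqs]
    return float(freqs[int(np.argmax(best))])


def gabor(size, freq, theta, sigma):
    """Toy oriented kernel used only to exercise the functions above."""
    half = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size] - half
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * grating(size, freq, theta)
```

To apply this to a real network, one would have to decide how to reduce a multi-channel filter to 2D slices (or probe the layer's activations directly). That choice is left open here because the abstract does not specify it.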
While stochastic gradient descent (SGD) is still the de facto algorithm in deep learning, adaptive methods like Clipped SGD/Adam have been observed to outperform SGD across important tasks, such as attention models. The settings under which SGD
In this paper, we propose an image quality transformer (IQT) that successfully applies a transformer architecture to a perceptual full-reference image quality assessment (IQA) task. Perceptual representation becomes more important in image quality as
Tractable models of human perception have proved to be challenging to build. Hand-designed models such as MS-SSIM remain popular predictors of human image quality judgements due to their simplicity and speed. Recent modern deep learning approaches ca
We propose a new framework for reasoning about generalization in deep learning. The core idea is to couple the Real World, where optimizers take stochastic gradient steps on the empirical loss, to an Ideal World, where optimizers take steps on the po
RGBD images, combining high-resolution color and lower-resolution depth from various types of depth sensors, are increasingly common. One can significantly improve the resolution of depth maps by taking advantage of color information; deep learning m