ﻻ يوجد ملخص باللغة العربية
Autoencoders are commonly trained using element-wise loss. However, element-wise loss disregards high-level structures in the image which can lead to embeddings that disregard them as well. A recent improvement to autoencoders that helps alleviate this problem is the use of perceptual loss. This work investigates perceptual loss from the perspective of encoder embeddings themselves. Autoencoders are trained to embed images from three different computer vision datasets using perceptual loss based on a pretrained model as well as pixel-wise loss. A host of different predictors are trained to perform object positioning and classification on the datasets given the embedded images as input. The two kinds of losses are evaluated by comparing how the predictors performed with embeddings from the differently trained autoencoders. The results show that, in the image domain, the embeddings generated by autoencoders trained with perceptual loss enable more accurate predictions than those trained with element-wise loss. Furthermore, the results show that, on the task of object positioning of a small-scale feature, perceptual loss can improve the results by a factor 10. The experimental setup is available online: https://github.com/guspih/Perceptual-Autoencoders
By benefiting from perceptual losses, recent studies have improved significantly the performance of the super-resolution task, where a high-resolution image is resolved from its low-resolution counterpart. Although such objective functions generate n
In this paper, we propose an image quality transformer (IQT) that successfully applies a transformer architecture to a perceptual full-reference image quality assessment (IQA) task. Perceptual representation becomes more important in image quality as
This paper proposes an explicit way to optimize the super-resolution network for generating visually pleasing images. The previous approaches use several loss functions which is hard to interpret and has the implicit relationships to improve the perc
The vast majority of photos taken today are by mobile phones. While their quality is rapidly growing, due to physical limitations and cost constraints, mobile phone cameras struggle to compare in quality with DSLR cameras. This motivates us to comput
We present a learning-based approach with pose perceptual loss for automatic music video generation. Our method can produce a realistic dance video that conforms to the beats and rhymes of almost any given music. To achieve this, we firstly generate