Do you want to publish a course? Click here

No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

163   0   0.0 ( 0 )
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations, it is a complex and unsolved problem due to the absence of the pristine reference image. In this paper, we propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and self-attention mechanism in Transformers to extract both local and non-local features from the input image. We capture local structure information of the image via CNNs, then to circumvent the locality bias among the extracted CNNs features and obtain a non-local representation of the image, we utilize Transformers on the extracted features where we model them as a sequential input to the Transformer model. Furthermore, to improve the monotonicity correlation between the subjective and objective scores, we utilize the relative distance information among the images within each batch and enforce the relative ranking among them. Last but not least, we observe that the performance of NR-IQA models degrades when we apply equivariant transformations (e.g. horizontal flipping) to the inputs. Therefore, we propose a method that leverages self-consistency as a source of self-supervision to improve the robustness of NRIQA models. Specifically, we enforce self-consistency between the outputs of our quality assessment model for each image and its transformation (horizontally flipped) to utilize the rich self-supervisory information and reduce the uncertainty of the model. To demonstrate the effectiveness of our work, we evaluate it on seven standard IQA datasets (both synthetic and authentic) and show that our model achieves state-of-the-art results on various datasets.



rate research

Read More

No-reference image quality assessment (NR-IQA) has received increasing attention in the IQA community since reference image is not always available. Real-world images generally suffer from various types of distortion. Unfortunately, existing NR-IQA methods do not work with all types of distortion. It is a challenging task to develop universal NR-IQA that has the ability of evaluating all types of distorted images. In this paper, we propose a universal NR-IQA method based on controllable list-wise ranking (CLRIQA). First, to extend the authentically distorted image dataset, we present an imaging-heuristic approach, in which the over-underexposure is formulated as an inverse of Weber-Fechner law, and fusion strategy and probabilistic compression are adopted, to generate the degraded real-world images. These degraded images are label-free yet associated with quality ranking information. We then design a controllable list-wise ranking function by limiting rank range and introducing an adaptive margin to tune rank interval. Finally, the extended dataset and controllable list-wise ranking function are used to pre-train a CNN. Moreover, in order to obtain an accurate prediction model, we take advantage of the original dataset to further fine-tune the pre-trained network. Experiments evaluated on four benchmark datasets (i.e. LIVE, CSIQ, TID2013, and LIVE-C) show that the proposed CLRIQA improves the state of the art by over 9% in terms of overall performance. The code and model are publicly available at https://github.com/GZHU-Image-Lab/CLRIQA.
In this paper, we propose a no-reference (NR) image quality assessment (IQA) method via feature level pseudo-reference (PR) hallucination. The proposed quality assessment framework is grounded on the prior models of natural image statistical behaviors and rooted in the view that the perceptually meaningful features could be well exploited to characterize the visual quality. Herein, the PR features from the distorted images are learned by a mutual learning scheme with the pristine reference as the supervision, and the discriminative characteristics of PR features are further ensured with the triplet constraints. Given a distorted image for quality inference, the feature level disentanglement is performed with an invertible neural layer for final quality prediction, leading to the PR and the corresponding distortion features for comparison. The effectiveness of our proposed method is demonstrated on four popular IQA databases, and superior performance on cross-database evaluation also reveals the high generalization capability of our method. The implementation of our method is publicly available on https://github.com/Baoliang93/FPR.
The process of rendering high dynamic range (HDR) images to be viewed on conventional displays is called tone mapping. However, tone mapping introduces distortions in the final image which may lead to visual displeasure. To quantify these distortions, we introduce a novel no-reference quality assessment technique for these tone mapped images. This technique is composed of two stages. In the first stage, we employ a convolutional neural network (CNN) to generate quality aware maps (also known as distortion maps) from tone mapped images by training it with the ground truth distortion maps. In the second stage, we model the normalized image and distortion maps using an Asymmetric Generalized Gaussian Distribution (AGGD). The parameters of the AGGD model are then used to estimate the quality score using support vector regression (SVR). We show that the proposed technique delivers competitive performance relative to the state-of-the-art techniques. The novelty of this work is its ability to visualize various distortions as quality maps (distortion maps), especially in the no-reference setting, and to use these maps as features to estimate the quality score of tone mapped images.
To guarantee a satisfying Quality of Experience (QoE) for consumers, it is required to measure image quality efficiently and reliably. The neglect of the high-level semantic information may result in predicting a clear blue sky as bad quality, which is inconsistent with human perception. Therefore, in this paper, we tackle this problem by exploiting the high-level semantics and propose a novel no-reference image quality assessment method for realistic blur images. Firstly, the whole image is divided into multiple overlapping patches. Secondly, each patch is represented by the high-level feature extracted from the pre-trained deep convolutional neural network model. Thirdly, three different kinds of statistical structures are adopted to aggregate the information from different patches, which mainly contain some common statistics (i.e., the mean&standard deviation, quantiles and moments). Finally, the aggregated features are fed into a linear regression model to predict the image quality. Experiments show that, compared with low-level features, high-level features indeed play a more critical role in resolving the aforementioned challenging problem for quality estimation. Besides, the proposed method significantly outperforms the state-of-the-art methods on two realistic blur image databases and achieves comparable performance on two synthetic blur image databases.
81 - Wei Sun , Tao Wang , Xiongkuo Min 2021
In this paper, we propose a deep learning based video quality assessment (VQA) framework to evaluate the quality of the compressed users generated content (UGC) videos. The proposed VQA framework consists of three modules, the feature extraction module, the quality regression module, and the quality pooling module. For the feature extraction module, we fuse the features from intermediate layers of the convolutional neural network (CNN) network into final quality-aware feature representation, which enables the model to make full use of visual information from low-level to high-level. Specifically, the structure and texture similarities of feature maps extracted from all intermediate layers are calculated as the feature representation for the full reference (FR) VQA model, and the global mean and standard deviation of the final feature maps fused by intermediate feature maps are calculated as the feature representation for the no reference (NR) VQA model. For the quality regression module, we use the fully connected (FC) layer to regress the quality-aware features into frame-level scores. Finally, a subjectively-inspired temporal pooling strategy is adopted to pool frame-level scores into the video-level score. The proposed model achieves the best performance among the state-of-the-art FR and NR VQA models on the Compressed UGC VQA database and also achieves pretty good performance on the in-the-wild UGC VQA databases.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا