Stereoscopic video technologies have been introduced to the consumer market in the past few years. A key factor in designing a 3D system is understanding how different visual cues and distortions affect the perceptual quality of stereoscopic video. The most reliable way to assess 3D video quality is through subjective tests. However, subjective evaluation is time-consuming, expensive, and in some cases not possible. The alternative is to develop objective quality metrics, which attempt to model the Human Visual System (HVS) in order to assess perceptual quality. Although several 2D quality metrics have been proposed for still images and videos, efforts for 3D content are still at an early stage. In this paper, we propose a new full-reference quality metric for 3D content. Our method mimics the HVS by fusing information from both the left and right views to construct the cyclopean view, and by taking into account the sensitivity of the HVS to contrast and to the disparity between the views. In addition, a temporal pooling strategy is used to address the effect of temporal variations in video quality. Performance evaluations showed that our 3D quality metric accurately quantifies the quality degradation caused by several representative types of distortion, with a Pearson correlation coefficient of 90.8%, which is competitive with state-of-the-art 3D quality metrics.
Visual Attention Models (VAMs) predict the locations of the image or video regions that are most likely to attract human attention. Although saliency detection is well explored for 2D image and video content, only a few attempts have been made to design 3D saliency prediction models. Newly proposed 3D visual attention models have to be validated on large-scale video saliency prediction datasets that also contain eye-tracking data. Several eye-tracking datasets are publicly available for 2D image and video content. In the case of 3D, however, the research community still needs large-scale video saliency datasets for validating different 3D-VAMs. In this paper, we introduce a large-scale dataset containing eye-tracking data collected from 61 stereoscopic 3D videos (and also 2
Increasing the frame rate of a 3D video generally results in improved Quality of Experience (QoE). However, higher frame rates involve a higher degree of complexity in capturing, transmission, storage, and display. The question that arises here is what frame rate guarantees high viewing quality of experience given the existing/required 3D devices and technologies (3D cameras, 3D TVs, compression, transmission bandwidth, and storage capacity). This question has already been addressed for the case of 2D video, but not for 3D. The objective of this paper is to study the relationship between 3D quality and bitrate at different frame rates. Our performance evaluations show that increasing the frame rate of 3D videos beyond 60 fps may not be visually distinguishable. In addition, our experiments show that when the available bandwidth is reduced, the highest possible 3D quality of experience can be achieved by adjusting (decreasing) the frame rate instead of increasing the compression ratio. The results of our study are of particular interest to network providers for rate adaptation in variable bitrate channels.
A key factor in designing 3D systems is understanding how different visual cues and distortions affect the perceptual quality of 3D video. The most reliable way to assess video quality is through subjective tests. However, subjective evaluation is time-consuming, expensive, and in most cases not even possible. An alternative solution is objective quality metrics, which attempt to model the Human Visual System (HVS) in order to assess perceptual quality. The potential of 3D technology to significantly improve the immersiveness of video content has been hampered by the difficulty of objectively assessing Quality of Experience (QoE). A no-reference (NR) objective 3D quality metric, which could help determine capture parameters and improve perceptual quality during playback, would be welcomed by camera and display manufacturers. Network providers would embrace a full-reference (FR) 3D quality metric, as they could use it to ensure efficient QoE-based resource management during compression and Quality of Service (QoS) during transmission.
The emergence of multiview displays has made the need for synthesizing virtual views more pronounced, since it is not practical to capture all of the possible views when filming multiview content. View synthesis is performed using the available views and depth maps. There is a correlation between the quality of the synthesized views and the quality of the depth maps. In this paper, we study the effect of depth map quality on the perceptual quality of synthesized views through subjective and objective analysis. Our evaluation results show that: 1) 3D video quality depends highly on the depth map quality and 2) the Visual Information Fidelity index computed between the reference and distorted depth maps has a Pearson correlation coefficient of 0.75 and a Spearman rank order correlation coefficient of 0.67 with the subjective 3D video quality.
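The Pearson and Spearman figures quoted in these abstracts are typically obtained by correlating an objective metric's per-video scores with subjective mean opinion scores (MOS). The following is a minimal sketch of that computation, not the authors' evaluation code: the function names and the sample scores are hypothetical, and the nonlinear fitting step that many studies apply before computing the Pearson coefficient is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks of the values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical data: objective metric scores and subjective MOS per video.
    objective_scores = [0.91, 0.75, 0.62, 0.40, 0.33]
    subjective_mos = [4.5, 4.1, 3.0, 2.2, 2.4]
    print("Pearson: ", pearson(objective_scores, subjective_mos))
    print("Spearman:", spearman(objective_scores, subjective_mos))
```

The Pearson coefficient measures how linearly the metric tracks the subjective scores, while the Spearman coefficient depends only on their ordering, which is why the two are usually reported together.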
This paper describes a quality assessment model for perceptual video compression applications (PVM), which simulates visual masking and distortion-artefact perception using an adaptive combination of noticeable distortions and blurring artefacts. The method shows significant improvement over existing quality metrics based on the VQEG database, and provides compatibility with in-loop rate-quality optimisation for next generation video codecs due to its latency and complexity attributes. Performance comparisons are validated against a range of different distortion types.