Panoramic video is widely used to build virtual reality (VR) experiences and is expected to be one of the next-generation killer apps. Transmitting panoramic VR videos is challenging for two reasons: 1) panoramic VR videos are typically much larger than normal videos, yet they must be transmitted over the limited bandwidth of mobile networks; 2) high-resolution and fluent views must be provided to guarantee a superior user experience and avoid side effects such as dizziness and nausea. To address these two problems, we propose a novel interactive streaming technology, namely the Focus-based Interactive Streaming Framework (FISF). FISF consists of three parts: 1) we use the classic clustering algorithm DBSCAN to analyze real user data for Video Focus Detection (VFD); 2) we propose a Focus-based Interactive Streaming Technology (FIST), including a static version and a dynamic version; 3) we propose two optimization methods: focus merging and a prefetch strategy. Experimental results show that FISF significantly outperforms the state of the art. The paper was submitted to SIGCOMM 2017, VR/AR Network, on 31 Mar 2017 at 10:44:04am EDT.
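The focus-detection step can be illustrated with a minimal DBSCAN sketch. It is not the authors' implementation: the representation of each viewing sample as a (yaw, pitch) pair in degrees, the plain Euclidean distance (which ignores wraparound at ±180 degrees), the sample data, and the values of `eps` and `min_pts` are all assumptions made for illustration.

```python
from math import hypot

NOISE = -1  # label for samples that belong to no focus


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing
    return labels


def focus_centers(points, labels):
    """Mean (yaw, pitch) of each cluster, treated here as one video focus."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        if lab == NOISE:
            continue
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}


# Hypothetical viewing directions collected from users for one video segment.
views = [(10, 5), (12, 6), (11, 4), (9, 5),
         (100, -20), (102, -19), (101, -21), (99, -20),
         (-150, 60)]
labels = dbscan(views, eps=5.0, min_pts=3)
centers = focus_centers(views, labels)
```

With this sample data, the two dense groups of viewing directions become two foci and the isolated sample is treated as noise. A density-based method suits this setting because the number of foci per segment does not have to be fixed in advance.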
The recent rise of interest in Virtual Reality (VR) came with the availability of commodity commercial VR products, such as the Head Mounted Displays (HMD) created by Oculus and other vendors. To accelerate user adoption of VR headsets, content providers should focus on producing high-quality immersive content for these devices. Similarly, multimedia streaming service providers should enable the means to stream 360 VR content on their platforms. In this study, we try to cover different aspects related to VR content representation, streaming, and quality assessment that will help establish the basic knowledge of how to build a VR streaming system.
Providing a depth-rich Virtual Reality (VR) experience to users without causing discomfort remains a challenge with today's commercially available head-mounted displays (HMDs), which enforce strict measures on stereoscopic camera parameters to keep visual discomfort to a minimum. However, these measures often lead to an unimpressive VR experience with a shallow feeling of depth. We propose the first method ready to be used with existing consumer HMDs for automated stereoscopic camera control in virtual environments (VEs). Using radial basis function interpolation and projection matrix manipulations, our method makes it possible to significantly enhance user experience in terms of overall perceived depth while maintaining visual discomfort on a par with the default arrangement. In our implementation, we also introduce the first immersive interface for authoring a unique 3D stereoscopic cinematography for any VE to be experienced with consumer HMDs. We conducted a user study that demonstrates the benefits of our approach in terms of superior picture quality and perceived depth. We also investigated the effects of using depth of field (DoF) in combination with our approach and observed that adding our DoF implementation was perceived as a degraded experience, or at best a similar one.
Virtual Reality (VR) enables users to collaborate while exploring scenarios not realizable in the physical world. We propose CollabVR, a distributed multi-user collaboration environment, to explore how digital content improves the expression and understanding of ideas among groups. To achieve this, we designed and examined three possible configurations for participants and shared manipulable objects. In configuration (1), participants stand side-by-side. In (2), participants are positioned across from each other, mirrored face-to-face. In (3), called eyes-free, participants stand side-by-side looking at a shared display and draw upon a horizontal surface. We also explored a telepathy mode, in which participants could see from each other's point of view. We implemented 3DSketch visual objects for participants to manipulate and move between virtual content boards in the environment. To evaluate the system, we conducted a study in which four people at a time used each of the three configurations to cooperate and communicate ideas with each other. We report experimental results and interview responses.
Adaptive Bit Rate (ABR) decisions play a crucial role in ensuring satisfactory Quality of Experience (QoE) in video streaming applications, where past network statistics are mainly leveraged to predict future network bandwidth. However, most algorithms, whether rule-based or learning-driven, feed raw throughput traces, or traces classified by traditional statistics (i.e., mean/standard deviation), into the ABR decision, leading to compromised performance in specific scenarios. Given that network connections vary over time (e.g., WiFi, cellular, and wired links), this paper proposes to learn the ANT (Accurate Network Throughput) model, which characterizes the full spectrum of past network throughput dynamics to derive the network condition associated with a specific cluster of network throughput segments (NTS). Each cluster of NTS is then used to generate a dedicated ABR model, so as to better capture the network dynamics of diverse connections. We have integrated the ANT model with an existing reinforcement learning (RL)-based ABR decision engine, where different ABR models are applied in response to the accurate network sensing for better rate decisions. Extensive experimental results show that our approach improves user QoE by 65.5% and 31.3% compared with the state-of-the-art Pensieve and Oboe, respectively, across a wide range of network scenarios.
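The idea of routing each throughput segment to a cluster-specific ABR model can be sketched as follows. This is only an illustration of the dispatch structure, not the ANT model itself, which is learned: the prototype windows, the bitrate ladder, the per-cluster safety factors, and the nearest-prototype rule are all hypothetical stand-ins, and the simple rate rule takes the place of the dedicated RL-based models.

```python
from math import dist

# Hypothetical prototype throughput windows (Mbps), one per cluster of
# network throughput segments. A learned model would replace this table.
PROTOTYPES = {
    "stable_high": [8.0, 8.0, 8.0, 8.0],
    "stable_low": [1.5, 1.5, 1.5, 1.5],
    "volatile": [6.0, 1.0, 7.0, 1.0],
}

# Hypothetical bitrate ladder (Mbps) offered by the video server.
BITRATES_MBPS = [0.3, 0.75, 1.2, 2.85, 4.3]

# Stand-in for the dedicated per-cluster ABR models: each cluster only
# differs in how conservatively it discounts the observed throughput.
SAFETY = {"stable_high": 0.9, "stable_low": 0.9, "volatile": 0.5}


def classify_segment(segment):
    """Assign a throughput window to the nearest prototype (Euclidean)."""
    return min(PROTOTYPES, key=lambda name: dist(segment, PROTOTYPES[name]))


def choose_bitrate(segment):
    """Pick the highest bitrate that fits the cluster-specific budget."""
    cluster = classify_segment(segment)
    mean_throughput = sum(segment) / len(segment)
    budget = SAFETY[cluster] * mean_throughput
    feasible = [b for b in BITRATES_MBPS if b <= budget]
    # Fall back to the lowest rung when nothing fits the budget.
    return cluster, (max(feasible) if feasible else BITRATES_MBPS[0])


steady = choose_bitrate([7.5, 8.2, 7.9, 8.4])
bursty = choose_bitrate([5.5, 1.2, 6.8, 0.9])
```

The bursty window has a mean of 3.6 Mbps, which a single shared rule with a 0.9 safety factor would map to 2.85 Mbps; the cluster-specific rule selects 1.2 Mbps instead. Comparing whole windows rather than only their mean and standard deviation is what lets the sketch tell the two cases apart by shape.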
Recent years have seen an explosion in wireless video communication systems. Optimization in such systems is crucial, but most existing methods intended to optimize the performance of multi-user wireless video transmission are inefficient. Some works (e.g., Network Utility Maximization (NUM)) are myopic: they choose actions to maximize instantaneous video quality while ignoring the future impact of these actions. Such myopic solutions are known to be inferior to foresighted solutions that optimize long-term video quality. On the other hand, existing foresighted solutions, such as rate-distortion optimized packet scheduling, focus on single-user wireless video transmission and ignore resource allocation among users. In this paper, we propose an optimal solution for performing joint foresighted resource allocation and packet scheduling among multiple users transmitting video over a shared wireless network. A key challenge in developing foresighted solutions for multiple video users is that the users' decisions are coupled. To decouple the users' decisions, we adopt a novel dual decomposition approach, which differs from conventional optimization solutions such as NUM and determines foresighted policies. Specifically, we propose an informationally-decentralized algorithm in which the network manager updates resource prices (i.e., the dual variables associated with the resource constraints), and the users make individual video packet scheduling decisions based on these prices. Because a priori knowledge of the system dynamics is almost never available at run-time, the proposed solution can learn online, concurrently with performing the foresighted optimization. Simulation results show 7 dB and 3 dB improvements in Peak Signal-to-Noise Ratio (PSNR) over myopic solutions and existing foresighted solutions, respectively.
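The price-based coordination described above can be sketched in a deliberately simplified, static form. This is not the proposed foresighted algorithm: each user here solves a one-shot problem with an assumed logarithmic utility instead of a long-term packet scheduling problem, and the weights, capacity, step size, and iteration count are hypothetical. The sketch only shows how a resource price decouples the users' decisions.

```python
def user_demand(weight, price):
    """Each user independently maximizes weight*log(x) - price*x.

    Setting the derivative weight/x - price to zero gives x = weight/price.
    The log utility is an assumption made for this illustration.
    """
    return weight / price


def dual_decomposition(weights, capacity, step=0.02, iterations=200):
    """Subgradient price updates performed by the network manager."""
    price = 1.0
    for _ in range(iterations):
        demands = [user_demand(w, price) for w in weights]
        excess = sum(demands) - capacity
        # Raise the price when demand exceeds capacity, lower it otherwise;
        # the floor keeps the price strictly positive.
        price = max(1e-6, price + step * excess)
    return price, [user_demand(w, price) for w in weights]


# Three hypothetical users with different utility weights share 12 units.
price, allocation = dual_decomposition([1.0, 2.0, 3.0], capacity=12.0)
```

Under these assumptions the price settles where total demand equals capacity, which is the sum of the weights divided by the capacity (0.5 here), giving allocations of roughly 2, 4, and 6 units. The manager never needs to know the users' utilities; it only observes aggregate demand, which is the sense in which such a scheme is informationally decentralized.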