No Arabic abstract
Adaptive Bit Rate (ABR) decision plays a crucial role for ensuring satisfactory Quality of Experience (QoE) in video streaming applications, in which past network statistics are mainly leveraged for future network bandwidth prediction. However, most algorithms, either rules-based or learning-driven approaches, feed throughput traces or classified traces based on traditional statistics (i.e., mean/standard deviation) to drive ABR decision, leading to compromised performances in specific scenarios. Given the diverse network connections (e.g., WiFi, cellular and wired link) from time to time, this paper thus proposes to learn the ANT (a.k.a., Accurate Network Throughput) model to characterize the full spectrum of network throughput dynamics in the past for deriving the proper network condition associated with a specific cluster of network throughput segments (NTS). Each cluster of NTS is then used to generate a dedicated ABR model, by which we wish to better capture the network dynamics for diverse connections. We have integrated the ANT model with existing reinforcement learning (RL)-based ABR decision engine, where different ABR models are applied to respond to the accurate network sensing for better rate decision. Extensive experiment results show that our approach can significantly improve the user QoE by 65.5% and 31.3% respectively, compared with the state-of-the-art Pensive and Oboe, across a wide range of network scenarios.
We consider an interactive multiview video streaming (IMVS) system where clients select their preferred viewpoint in a given navigation window. To provide high quality IMVS, many high quality views should be transmitted to the clients. However, this is not always possible due to the limited and heterogeneous capabilities of the clients. In this paper, we propose a novel adaptive IMVS solution based on a layered multiview representation where camera views are organized into layered subsets to match the different clients constraints. We formulate an optimization problem for the joint selection of the views subsets and their encoding rates. Then, we propose an optimal and a reduced computational complexity greedy algorithms, both based on dynamic-programming. Simulation results show the good performance of our novel algorithms compared to a baseline algorithm, proving that an effective IMVS adaptive solution should consider the scene content and the client capabilities and their preferences in navigation.
Interactive multi-view video streaming (IMVS) services permit to remotely immerse within a 3D scene. This is possible by transmitting a set of reference camera views (anchor views), which are used by the clients to freely navigate in the scene and possibly synthesize additional viewpoints of interest. From a networking perspective, the big challenge in IMVS systems is to deliver to each client the best set of anchor views that maximizes the navigation quality, minimizes the view-switching delay and yet satisfies the network constraints. Integrating adaptive streaming solutions in free-viewpoint systems offers a promising solution to deploy IMVS in large and heterogeneous scenarios, as long as the multi-view video representations on the server are properly selected. We therefore propose to optimize the multi-view data at the server by minimizing the overall resource requirements, yet offering a good navigation quality to the different users. We propose a video representation set optimization for multiview adaptive streaming systems and we show that it is NP-hard. We therefore introduce the concept of multi-view navigation segment that permits to cast the video representation set selection as an integer linear programming problem with a bounded computational complexity. We then show that the proposed solution reduces the computational complexity while preserving optimality in most of the 3D scenes. We then provide simulation results for different classes of users and show the gain offered by an optimal multi-view video representation selection compared to recommended representation sets (e.g., Netflix and Apple ones) or to a baseline representation selection algorithm where the encoding parameters are decided a priori for all the views.
Adaptive bitrate (ABR) streaming is the de facto solution for achieving smooth viewing experiences under unstable network conditions. However, most of the existing rate adaptation approaches for ABR are content-agnostic, without considering the semantic information of the video content. Nevertheless, semantic information largely determines the informativeness and interestingness of the video content, and consequently affects the QoE for video streaming. One common case is that the user may expect higher quality for the parts of video content that are more interesting or informative so as to reduce video distortion and information loss, given that the overall bitrate budgets are limited. This creates two main challenges for such a problem: First, how to determine which parts of the video content are more interesting? Second, how to allocate bitrate budgets for different parts of the video content with different significances? To address these challenges, we propose a Content-of-Interest (CoI) based rate adaptation scheme for ABR. We first design a deep learning approach for recognizing the interestingness of the video content, and then design a Deep Q-Network (DQN) approach for rate adaptation by incorporating video interestingness information. The experimental results show that our method can recognize video interestingness precisely, and the bitrate allocation for ABR can be aligned with the interestingness of video content while not compromising the performances on objective QoE metrics.
In this paper, we study the server-side rate adaptation problem for streaming tile-based adaptive 360-degree videos to multiple users who are competing for transmission resources at the network bottleneck. Specifically, we develop a convolutional neural network (CNN)-based viewpoint prediction model to capture the nonlinear relationship between the future and historical viewpoints. A Laplace distribution model is utilized to characterize the probability distribution of the prediction error. Given the predicted viewpoint, we then map the viewport in the spherical space into its corresponding planar projection in the 2-D plane, and further derive the visibility probability of each tile based on the planar projection and the prediction error probability. According to the visibility probability, tiles are classified as viewport, marginal and invisible tiles. The server-side tile rate allocation problem for multiple users is then formulated as a non-linear discrete optimization problem to minimize the overall received video distortion of all users and the quality difference between the viewport and marginal tiles of each user, subject to the transmission capacity constraints and users specific viewport requirements. We develop a steepest descent algorithm to solve this non-linear discrete optimization problem, by initializing the feasible starting point in accordance with the optimal solution of its continuous relaxation. Extensive experimental results show that the proposed algorithm can achieve a near-optimal solution, and outperforms the existing rate adaptation schemes for tile-based adaptive 360-video streaming.
With the merit of containing full panoramic content in one camera, Virtual Reality (VR) and 360-degree videos have attracted more and more attention in the field of industrial cloud manufacturing and training. Industrial Internet of Things (IoT), where many VR terminals needed to be online at the same time, can hardly guarantee VRs bandwidth requirement. However, by making use of users quality of experience (QoE) awareness factors, including the relative moving speed and depth difference between the viewpoint and other content, bandwidth consumption can be reduced. In this paper, we propose OFB-VR (Optical Flow Based VR), an interactive method of VR streaming that can make use of VR users QoE awareness to ease the bandwidth pressure. The Just-Noticeable Difference through Optical Flow Estimation (JND-OFE) is explored to quantify users awareness of quality distortion in 360-degree videos. Accordingly, a novel 360-degree videos QoE metric based on PSNR and JND-OFE (PSNR-OF) is proposed. With the help of PSNR-OF, OFB-VR proposes a versatile-size tiling scheme to lessen the tiling overhead. A Reinforcement Learning(RL) method is implemented to make use of historical data to perform Adaptive BitRate(ABR). For evaluation, we take two prior VR streaming schemes, Pano and Plato, as baselines. Vast evaluations show that our system can increase the mean PSNR-OF score by 9.5-15.8% while maintaining the same rebuffer ratio compared with Pano and Plato in a fluctuate LTE bandwidth dataset. Evaluation results show that OFB-VR is a promising prototype for actual interactive industrial VR. A prototype of OFB-VR can be found in https://github.com/buptexplorers/OFB-VR.