
Rate-Utility Optimized Streaming of Volumetric Media for Augmented Reality

Published by Jounsup Park
Publication date: 2018
Research field: Informatics Engineering
Research language: English





Volumetric media, popularly known as holograms, need to be delivered to users using both on-demand and live streaming, for new augmented reality (AR) and virtual reality (VR) experiences. As in video streaming, hologram streaming must support network adaptivity and fast startup, but must also moderate large bandwidths, multiple simultaneously streaming objects, and frequent user interaction, which requires low delay. In this paper, we introduce the first system to our knowledge designed specifically for streaming volumetric media. The system reduces bandwidth by introducing 3D tiles, and culling them or reducing their level of detail depending on their relation to the user's view frustum and distance to the user. Our system reduces latency by introducing a window-based buffer, which, in contrast to a queue-based buffer, allows insertions near the head of the buffer rather than only at the tail, to respond quickly to user interaction. To allocate bits between different tiles across multiple objects, we introduce a simple greedy yet provably optimal algorithm for rate-utility optimization. We introduce utility measures based not only on the underlying quality of the representation, but also on the level of detail relative to the user's viewpoint and device resolution. Simulation results show that the proposed algorithm provides superior quality compared to existing video-streaming approaches adapted to hologram streaming, in terms of utility and user experience over variable, throughput-constrained networks.
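The greedy bit allocation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `greedy_allocate`, the tile names, and the (rate, utility) numbers are hypothetical, and the sketch assumes each tile offers levels of detail sorted by increasing rate, with utilities that already reflect visibility and distance. Under those assumptions, it repeatedly upgrades the tile whose next level yields the largest utility gain per extra bit.

```python
import heapq


def greedy_allocate(tiles, budget):
    """Pick one level per tile by greedy utility-per-bit upgrades.

    tiles:  dict mapping a tile id to a list of (rate_bits, utility) pairs,
            sorted by increasing rate; level 0 is the cheapest option
            (here assumed to be "send nothing", i.e. (0, 0.0)).
    budget: total number of bits available for this allocation round.
    Returns (choice, spent), where choice maps tile id -> chosen level index.
    """
    choice = {tid: 0 for tid in tiles}
    spent = sum(levels[0][0] for levels in tiles.values())
    heap = []  # max-heap on utility gain per bit, via negated slopes

    def push_next_upgrade(tid):
        levels = tiles[tid]
        i = choice[tid]
        if i + 1 < len(levels):
            d_rate = levels[i + 1][0] - levels[i][0]
            d_util = levels[i + 1][1] - levels[i][1]
            if d_rate > 0 and d_util > 0:
                heapq.heappush(heap, (-d_util / d_rate, tid))

    for tid in tiles:
        push_next_upgrade(tid)

    while heap:
        _, tid = heapq.heappop(heap)
        levels = tiles[tid]
        i = choice[tid]
        d_rate = levels[i + 1][0] - levels[i][0]
        if spent + d_rate > budget:
            continue  # upgrade does not fit; this tile stays at its level
        choice[tid] = i + 1
        spent += d_rate
        push_next_upgrade(tid)

    return choice, spent


# Hypothetical example: a tile near the viewer and a tile far from the viewer.
tiles = {
    "near": [(0, 0.0), (100, 10.0), (300, 14.0)],
    "far": [(0, 0.0), (100, 4.0), (300, 5.0)],
}
choice, spent = greedy_allocate(tiles, budget=400)
print(choice, spent)
```

A greedy rule of this kind is only expected to be optimal when each tile's utility grows concavely with rate, which is why the example levels show diminishing returns.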




Read also

The recent rise of interest in Virtual Reality (VR) came with the availability of commodity commercial VR products, such as the Head Mounted Displays (HMD) created by Oculus and other vendors. To accelerate the user adoption of VR headsets, content providers should focus on producing high quality immersive content for these devices. Similarly, multimedia streaming service providers should enable the means to stream 360 VR content on their platforms. In this study, we try to cover different aspects related to VR content representation, streaming, and quality assessment that will help establish the basic knowledge of how to build a VR streaming system.
Modern media data such as 360 videos and light field (LF) images are typically captured in much higher dimensions than the observers' visual displays. To efficiently browse high-dimensional media over bandwidth-constrained networks, a navigational streaming model is considered: a client navigates the large media space by dictating a navigation path to a server, which in response transmits the corresponding pre-encoded media data units (MDU) to the client one-by-one in sequence. Intra-coding an MDU (I-MDU) would result in a large bitrate but I-MDU can be randomly accessed, while inter-coding an MDU (P-MDU) using another MDU as a predictor incurs a small coding cost but imposes an order where the predictor must be first transmitted and decoded. From a compression perspective, the technical challenge is: how to achieve coding gain via inter-coding of MDUs, while enabling adequate random access for satisfactory user navigation. To address this problem, we propose landmarks, a selection of key MDUs from the high-dimensional media. Using landmarks as predictors, nearby MDUs in local neighborhoods are inter-coded, resulting in a predictive MDU structure with controlled coding cost. It means that any requested MDU can be decoded by at most transmitting a landmark and an inter-coded MDU, enabling navigational random access. To build a landmarked MDU structure, we employ tree-structured vector quantizer (TSVQ) to first optimize landmark locations, then iteratively add/remove inter-coded MDUs as refinements using a fast branch-and-bound technique. Taking interactive LF images and viewport adaptive 360 images as illustrative applications, and I-, P- and previously proposed merge frames to intra- and inter-code MDUs, we show experimentally that landmarked MDU structures can noticeably reduce the expected transmission cost compared with MDU structures without landmarks.
Adaptive streaming addresses the increasing and heterogenous demand of multimedia content over the Internet by offering several encod
Adaptive bitrate (ABR) streaming is the de facto solution for achieving smooth viewing experiences under unstable network conditions. However, most of the existing rate adaptation approaches for ABR are content-agnostic, without considering the semantic information of the video content. Nevertheless, semantic information largely determines the informativeness and interestingness of the video content, and consequently affects the QoE for video streaming. One common case is that the user may expect higher quality for the parts of video content that are more interesting or informative so as to reduce video distortion and information loss, given that the overall bitrate budgets are limited. This creates two main challenges for such a problem: First, how to determine which parts of the video content are more interesting? Second, how to allocate bitrate budgets for different parts of the video content with different significances? To address these challenges, we propose a Content-of-Interest (CoI) based rate adaptation scheme for ABR. We first design a deep learning approach for recognizing the interestingness of the video content, and then design a Deep Q-Network (DQN) approach for rate adaptation by incorporating video interestingness information. The experimental results show that our method can recognize video interestingness precisely, and the bitrate allocation for ABR can be aligned with the interestingness of video content while not compromising the performances on objective QoE metrics.
In this paper, we study the server-side rate adaptation problem for streaming tile-based adaptive 360-degree videos to multiple users who are competing for transmission resources at the network bottleneck. Specifically, we develop a convolutional neural network (CNN)-based viewpoint prediction model to capture the nonlinear relationship between the future and historical viewpoints. A Laplace distribution model is utilized to characterize the probability distribution of the prediction error. Given the predicted viewpoint, we then map the viewport in the spherical space into its corresponding planar projection in the 2-D plane, and further derive the visibility probability of each tile based on the planar projection and the prediction error probability. According to the visibility probability, tiles are classified as viewport, marginal and invisible tiles. The server-side tile rate allocation problem for multiple users is then formulated as a non-linear discrete optimization problem to minimize the overall received video distortion of all users and the quality difference between the viewport and marginal tiles of each user, subject to the transmission capacity constraints and users' specific viewport requirements. We develop a steepest descent algorithm to solve this non-linear discrete optimization problem, by initializing the feasible starting point in accordance with the optimal solution of its continuous relaxation. Extensive experimental results show that the proposed algorithm can achieve a near-optimal solution, and outperforms the existing rate adaptation schemes for tile-based adaptive 360-video streaming.
