Inferring the quality of streaming video applications is important for Internet service providers, but the fact that most video streams are encrypted makes this difficult. We develop models that infer quality metrics (i.e., startup delay and resolution) for encrypted streaming video services. Our paper builds on previous work but extends it in several ways. First, the model works in deployment settings where the video sessions and segments must be identified from a mix of traffic and the time precision of the collected traffic statistics is coarser (e.g., due to aggregation). Second, we develop a single composite model that works for a range of different services (i.e., Netflix, YouTube, Amazon, and Twitch), as opposed to just a single service. Third, unlike many previous models, the model performs predictions at finer granularity (e.g., the precise startup delay instead of just detecting short versus long delays), allowing better conclusions to be drawn about the ongoing streaming quality. Fourth, we demonstrate that the model is practical through a 16-month deployment in 66 homes and provide new insights about the relationships between Internet speed and the quality of the corresponding video streams for a variety of services; we find that higher speeds provide only minimal improvements to startup delay and resolution.
The diversity of video delivery pipelines poses a grand challenge to the evaluation of adaptive bitrate (ABR) streaming algorithms and objective quality-of-experience (QoE) models. Here we introduce the largest subject-rated database of its kind to date, namely WaterlooSQoE-IV, consisting of 1350 adaptive streaming videos created from diverse source contents, video encoders, network traces, ABR algorithms, and viewing devices. We collect human opinions for each video with a series of carefully designed subjective experiments. Subsequent data analysis and testing/comparison of ABR algorithms and QoE models using the database lead to a series of novel observations and interesting findings, in terms of the effectiveness of subjective experiment methodologies, the interactions between user experience and source content, viewing device and encoder type, the heterogeneities in the bias and preference of user experiences, the behaviors of ABR algorithms, and the performance of objective QoE models. Most importantly, our results suggest that a better objective QoE model, or a better understanding of human perceptual experience and behaviour, is the dominant factor in improving the performance of ABR algorithms, as opposed to advanced optimization frameworks, machine learning strategies, or bandwidth predictors, on which the majority of ABR research has focused in the past decade. On the other hand, our performance evaluation of 11 QoE models shows only a moderate correlation between state-of-the-art QoE models and subjective ratings, implying room for improvement in both QoE modeling and ABR algorithms. The database is made publicly available at: https://ece.uwaterloo.ca/~zduanmu/waterloosqoe4/.
Virtual reality systems today cannot yet stream immersive, retina-quality virtual reality video over a network. One of the greatest challenges to this goal is the sheer data rates required to transmit retina-quality video frames at high resolutions and frame rates. Recent work has leveraged the decay of visual acuity in human perception in novel gaze-contingent video compression techniques. In this paper, we show that reducing the motion-to-photon latency of a system itself is a key method for improving the compression ratio of gaze-contingent compression. Our key finding is that a client and streaming server system with sub-15ms latency can achieve 5x better compression than traditional techniques while also using simpler software algorithms than previous work.
Low-Power Wide-Area Network (LPWAN) is an enabling Internet-of-Things (IoT) technology that supports long-range, low-power, and low-cost connectivity to numerous devices. To avoid crowding in the limited ISM band (where most LPWANs operate) and the cost of licensed bands, the recently proposed SNOW (Sensor Network over White Spaces) is a promising LPWAN platform that operates over the TV white spaces. As it is a very recent technology and is still in its infancy, the current SNOW implementation uses USRP devices as LPWAN nodes, which have a high cost (~$750 USD per device) and a large form factor, hindering its applicability in practical deployments. In this paper, we implement SNOW using low-cost, small form-factor, low-power, and widely available commercial off-the-shelf (COTS) devices to enable its practical and large-scale deployment. Our choice of COTS device (TI CC13x0: CC1310 or CC1350) brings down the cost and form factor of a SNOW node by 25x and 10x, respectively. Implementing SNOW on the CC13x0 devices, however, raises a number of challenges for link reliability and communication range. Our implementation addresses these challenges by handling the peak-to-average power ratio problem, channel state information estimation, carrier frequency offset estimation, and the near-far power problem. Our deployment in the city of Detroit, Michigan demonstrates that CC13x0-based SNOW can achieve uplink and downlink throughputs of 11.2kbps and 4.8kbps per node, respectively, over a distance of 1km. Also, the overall uplink throughput increases linearly with the number of SNOW nodes.
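The figures quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below is only illustrative: the function names are hypothetical, and the aggregate-throughput helper assumes that every node sustains the reported per-node uplink rate, which is one simple reading of "increases linearly" rather than a result stated in the abstract.

```python
# Figures taken from the abstract; all names here are illustrative.
USRP_COST_USD = 750.0        # approximate per-device USRP cost
COST_REDUCTION_FACTOR = 25.0 # reported cost reduction for a CC13x0 node
UPLINK_KBPS_PER_NODE = 11.2  # reported per-node uplink throughput at 1 km


def cots_node_cost(usrp_cost=USRP_COST_USD, factor=COST_REDUCTION_FACTOR):
    """Implied approximate cost of one CC13x0-based node in USD."""
    return usrp_cost / factor


def aggregate_uplink_kbps(num_nodes, per_node_kbps=UPLINK_KBPS_PER_NODE):
    """Total uplink throughput, ASSUMING ideal linear scaling in which
    every node sustains the reported per-node rate."""
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    return num_nodes * per_node_kbps
```

Under these assumptions, a 25x reduction from roughly $750 implies a node cost of about $30.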
Video privacy leakage is becoming an increasingly severe public problem, especially in cloud-based video surveillance systems. This creates a need for secure cloud-based video applications in which the video is encrypted for privacy protection. Although some methods have been proposed for moving object detection and tracking in encrypted video, none performs robustly in complex and dynamic scenes. In this paper, we propose an efficient and robust privacy-preserving motion detection and multiple object tracking scheme for encrypted surveillance video bitstreams. By analyzing the properties of the video codec and format-compliant encryption schemes, we propose a new compressed-domain feature to capture motion information in complex surveillance scenarios. Based on this feature, we design an adaptive clustering algorithm for moving object segmentation with an accuracy of 4x4 pixels. We then propose a multiple object tracking scheme that uses Kalman filter estimation and adaptive measurement refinement. The proposed scheme does not require video decryption or full decompression and has a very low computational load. The experimental results demonstrate that our scheme achieves the best detection and tracking performance among existing works in the encrypted and compressed domain. Our scheme can be effectively used in complex surveillance scenarios with different challenges, such as camera movement/jitter, dynamic background, and shadows.
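To make the Kalman filter estimation step more concrete, here is a minimal, generic constant-velocity Kalman filter for one coordinate of a tracked object. It is a textbook sketch, not the authors' tracker: the class name, the noise parameters, and the simplified diagonal process noise are all assumptions, and the adaptive measurement refinement described above is not modeled.

```python
class ConstantVelocityKalman1D:
    """Generic 1D constant-velocity Kalman filter (illustrative only).

    State is [position, velocity]; only position is measured.
    """

    def __init__(self, position=0.0, velocity=0.0, dt=1.0,
                 process_var=0.01, measurement_var=1.0, initial_var=100.0):
        self.p = position
        self.v = velocity
        self.dt = dt
        self.q = process_var        # assumed diagonal process noise
        self.r = measurement_var    # assumed measurement noise variance
        self.P = [[initial_var, 0.0], [0.0, initial_var]]

    def predict(self):
        """Propagate state and covariance: x = F x, P = F P F^T + Q."""
        dt = self.dt
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a position measurement z."""
        (a, b), (c, d) = self.P
        innovation = z - self.p
        s = a + self.r                 # innovation variance
        k0, k1 = a / s, c / s          # Kalman gain
        self.p += k0 * innovation
        self.v += k1 * innovation
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.p, self.v


def track(measurements):
    """Run predict/update over a sequence of position measurements."""
    kf = ConstantVelocityKalman1D()
    for z in measurements:
        kf.predict()
        kf.update(z)
    return kf.p, kf.v
```

For a 2D bounding-box centroid, one such filter per axis would be a simple starting point; a multi-object tracker would additionally need to associate detections with tracks.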
Many of the video streaming applications in today's Internet involve the distribution of content from a CDN source to a large population of interested clients. However, widespread support for IP multicast is unavailable for technical and economic reasons, leaving the floor to application-layer multicast, which introduces excessive delays for the clients and increased traffic load for the network. This paper introduces an SDN-based framework that allows the network controller not only to deploy IP multicast between a source and subscribers, but also to control, via a simple northbound interface, the distributed set of sources where multiple-description coded (MDC) video content is available. We observe that for medium to heavy network loads, relative to the state of the art, the SDN-based streaming multicast video framework increases the PSNR of the received video significantly, from a level that is practically unwatchable to one that has good quality.
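Since the result above is reported in terms of PSNR, a short sketch of the standard definition may help: PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and received samples. The function name and the flat-list input format are assumptions made for illustration; the abstract does not describe how the authors computed PSNR.

```python
import math


def psnr(reference, received, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of pixel values (8-bit samples assumed by default)."""
    if len(reference) != len(received) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a received frame closer to the reference; identical frames yield infinite PSNR.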