A nonlinear transform based analog video transmission framework


Published by Yongtao Liu

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Yongtao Liu - Xiaopeng Fan - Yang Wang

Multimedia


Soft-cast, a cross-layer design for wireless video transmission, is proposed to solve the drawbacks of digital video transmission: threshold transmission framework achieving the same effect. Specifically, in the encoder, we carry out power allocation on the transformed coefficients and encode the coefficients based on the new formulation of power distortion. In the decoder, the LLSE estimator process is also improved. Together with the inverse nonlinear transform, the DCT coefficients can be recovered from the scaling factors, the LLSE estimator coefficients, and the metadata. Experimental results show that the proposed framework outperforms Soft-cast by 1.08 dB in PSNR, and the MSSIM gain reaches 2.35%, when transmitting under the same bandwidth and total power.
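For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard Soft-cast-style power allocation and per-coefficient LLSE decoding, not the nonlinear-transform method proposed in the paper. The function names, the example variances, and the assumption of independent zero-mean coefficients over an additive-noise channel are ours.

```python
import math

def softcast_scaling(variances, total_power):
    """Scaling factors g_i = lambda_i^(-1/4) * sqrt(P / sum_j sqrt(lambda_j)).

    Assumes every variance lambda_i is strictly positive.
    """
    norm = sum(math.sqrt(v) for v in variances)
    return [v ** -0.25 * math.sqrt(total_power / norm) for v in variances]

def llse_decode(received, gains, variances, noise_var):
    """Per-coefficient LLSE estimate: x_hat = g*lambda / (g^2*lambda + sigma^2) * y."""
    return [g * v / (g * g * v + noise_var) * y
            for y, g, v in zip(received, gains, variances)]

# Hypothetical coefficient variances and power budget, for illustration only.
variances = [100.0, 25.0, 4.0]
total_power = 3.0
gains = softcast_scaling(variances, total_power)

# The expected transmit power sum_i g_i^2 * lambda_i matches the budget.
transmit_power = sum(g * g * v for g, v in zip(gains, variances))

# With a noiseless channel the LLSE estimate inverts the scaling exactly.
coefficients = [10.0, -5.0, 2.0]
received = [g * x for g, x in zip(gains, coefficients)]
decoded = llse_decode(received, gains, variances, noise_var=0.0)
```

High-variance coefficients get smaller gains but still more total power, which is what minimizes the overall reconstruction error under a fixed power budget in this baseline.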


Fangwen Fu, Mihaela van der Schaar, 2009

In this paper, we formulate the collaborative multi-user wireless video transmission problem as a multi-user Markov decision process (MUMDP) by explicitly considering the users' heterogeneous video traffic characteristics, time-varying network conditions, and the resulting dynamic coupling between the wireless users. These environment dynamics are often ignored in existing multi-user video transmission solutions. To comply with the decentralized nature of wireless networks, we propose to decompose the MUMDP into local MDPs using Lagrangian relaxation. Unlike in conventional multi-user video transmission solutions stemming from the network utility maximization framework, the proposed decomposition enables each wireless user to individually solve its own dynamic cross-layer optimization (i.e., the local MDP) and the network coordinator to update the Lagrangian multipliers (i.e., resource prices) based on not only current, but also future resource needs of all users, such that the long-term video quality of all users is maximized. However, solving the MUMDP requires statistical knowledge of the experienced environment dynamics, which is often unavailable before transmission time. To overcome this obstacle, we then propose a novel online learning algorithm, which allows the wireless users to update their policies in multiple states during one time slot. This is different from conventional learning solutions, which often update one state per time slot. The proposed learning algorithm can significantly improve the learning performance, thereby dramatically improving the video quality experienced by the wireless users over time. Our simulation results demonstrate the efficiency of the proposed MUMDP framework as compared to conventional multi-user video transmission solutions.

Multimedia

DVMark: A Deep Multiscale Framework for Video Watermarking

Xiyang Luo, Yinxiao Li, Huiwen Chang, 2021

Video watermarking embeds a message into a cover video in an imperceptible manner, which can be retrieved even if the video undergoes certain modifications or distortions. Traditional watermarking methods are often manually designed for particular types of distortions and thus cannot simultaneously handle a broad spectrum of distortions. To this end, we propose a robust deep learning-based solution for video watermarking that is end-to-end trainable. Our model consists of a novel multiscale design where the watermarks are distributed across multiple spatial-temporal scales. It gains robustness against various distortions through a differentiable distortion layer, whereas non-differentiable distortions, such as popular video compression standards, are modeled by a differentiable proxy. Extensive evaluations on a wide variety of distortions show that our method outperforms traditional video watermarking methods as well as deep image watermarking models by a large margin. We further demonstrate the practicality of our method on a realistic video-editing application.

Multimedia

DeepQoE: A unified Framework for Learning to Predict Video QoE

Huaizheng Zhang, Han Hu, Guanyu Gao, 2018

Motivated by the prowess of deep learning (DL) based techniques in prediction, generalization, and representation learning, we develop a novel framework called DeepQoE to predict video quality of experience (QoE). The end-to-end framework first uses a combination of DL techniques (e.g., word embeddings) to extract generalized features. Next, these features are combined and fed into a neural network for representation learning. Such representations serve as inputs for classification or regression tasks. Evaluating the performance of DeepQoE with two datasets, we show that for the small dataset, the accuracy of all shallow learning algorithms is improved by using the representation derived from DeepQoE. For the large dataset, our DeepQoE framework achieves a significant performance improvement over the best baseline method (90.94% vs. 82.84%). Moreover, DeepQoE, also released as an open source tool, gives video QoE research much-needed flexibility in fitting different datasets, extracting generalized features, and learning representations.

Multimedia

Content based video retrieval

B. V. Patel, B. B. Meshram, 2012

Content based video retrieval is an approach for facilitating the searching and browsing of large image collections over the World Wide Web. In this approach, video analysis is conducted on low level visual properties extracted from video frames. We believed that in order to create an effective video retrieval system, visual perception must be taken into account. We conjectured that a technique which employs multiple features for indexing and retrieval would be more effective in the discrimination and search tasks of videos. In order to validate this claim, content based indexing and retrieval systems were implemented using color histograms, various texture features, and other approaches. Videos were stored in an Oracle 9i database, and a user study measured correctness of response.
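As a rough illustration of the color-histogram feature mentioned in this abstract, the sketch below builds a normalized RGB histogram per frame and compares two frames by histogram intersection. The abstract does not say which histogram layout or similarity measure the authors used, so the bin count, the intersection measure, and the function names here are assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a frame.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    """
    step = 256 / bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Quantize each channel, clamping so 255 always lands in the last bin.
        rq = min(int(r / step), bins_per_channel - 1)
        gq = min(int(g / step), bins_per_channel - 1)
        bq = min(int(b / step), bins_per_channel - 1)
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1.0
    return [count / len(pixels) for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical tiny "frames" for illustration.
red_frame = [(255, 0, 0)] * 4
blue_frame = [(0, 0, 255)] * 4
mixed_frame = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2

same_score = histogram_intersection(color_histogram(red_frame),
                                    color_histogram(red_frame))
diff_score = histogram_intersection(color_histogram(red_frame),
                                    color_histogram(blue_frame))
half_score = histogram_intersection(color_histogram(red_frame),
                                    color_histogram(mixed_frame))
```

A retrieval system along these lines would index one histogram per key frame and rank stored frames by their similarity to the query frame; combining it with texture features, as the abstract suggests, would mean merging several such scores.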

Multimedia; Computer Vision and Pattern Recognition

Graph Fourier Transform based Audio Zero-watermarking

Longting Xu, Daiyu Huang, Syed Faham Ali Zaidi, 2021

The frequent exchange of multimedia information in the present era creates an increasing demand for copyright protection. In this work, we propose a novel audio zero-watermarking technology based on the graph Fourier transform to enhance robustness with respect to copyright protection. In this approach, the combined shift operator is used to construct the graph signal, upon which the graph Fourier analysis is performed. The selected maximum absolute graph Fourier coefficients representing the characteristics of the audio segment are then encoded into a feature binary sequence using the K-means algorithm. Finally, the resultant feature binary sequence is XOR-ed with the watermark binary sequence to realize the embedding of the zero-watermarking. The experimental studies show that the proposed approach resists common and synchronization attacks more effectively than the existing state-of-the-art methods.
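The final XOR step described in this abstract can be sketched in a few lines. The sketch assumes the feature binary sequence has already been extracted (the graph Fourier analysis and K-means encoding are not shown), and the function names and bit values are hypothetical.

```python
def make_zero_watermark_key(feature_bits, watermark_bits):
    """XOR the audio feature bits with the watermark bits to form the stored key.

    The audio itself is left unmodified, which is what makes this "zero" watermarking.
    """
    if len(feature_bits) != len(watermark_bits):
        raise ValueError("feature and watermark sequences must have equal length")
    return [f ^ w for f, w in zip(feature_bits, watermark_bits)]

def recover_watermark(feature_bits, key_bits):
    """XOR features re-extracted from the (possibly attacked) audio with the key."""
    if len(feature_bits) != len(key_bits):
        raise ValueError("feature and key sequences must have equal length")
    return [f ^ k for f, k in zip(feature_bits, key_bits)]

# Hypothetical bit sequences for illustration.
features = [1, 0, 1, 1, 0, 0, 1, 0]
watermark = [0, 1, 1, 0, 1, 0, 0, 1]
key = make_zero_watermark_key(features, watermark)

# Unchanged features recover the watermark exactly.
recovered = recover_watermark(features, key)

# One flipped feature bit (e.g. after an attack) causes exactly one bit error.
attacked = features.copy()
attacked[3] ^= 1
bit_errors = sum(a != b for a, b in zip(recover_watermark(attacked, key), watermark))
```

Because recovery errors map one-to-one onto feature bit flips, the robustness of such a scheme depends entirely on how stable the extracted features are under attack.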

Multimedia; Sound; Audio and Speech Processing