Fast Block Structure Determination in AV1-based Multiple Resolutions Video Encoding

122 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Bichuan Guo

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Bichuan Guo - Yuxing Han - Jiangtao Wen

الوسائط المتعددة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The widely used adaptive HTTP streaming requires an efficient algorithm to encode the same video to different resolutions. In this paper, we propose a fast block structure determination algorithm based on the AV1 codec that accelerates high resolution encoding, which is the bottle-neck of multiple resolutions encoding. The block structure similarity across resolutions is modeled by the fineness of frame detail and scale of object motions, this enables us to accelerate high resolution encoding based on low resolution encoding results. The average depth of a blocks co-located neighborhood is used to decide early termination in the RDO process. Encoding results show that our proposed algorithm reduces encoding time by 30.1%-36.8%, while keeping BD-rate low at 0.71%-1.04%. Comparing to the state-of-the-art, our method halves performance loss without sacrificing time savings.

قيم البحث

84 - Bichuan Guo , Xinyao Chen , Jiawen Gu 2018

Due to differences in frame structure, existing multi-rate video encoding algorithms cannot be directly adapted to encoders utilizing special reference frames such as AV1 without introducing substantial rate-distortion loss. To tackle this problem, w e propose a novel bayesian block structure inference model inspired by a modification to an HEVC-based algorithm. It estimates the posterior probabilistic distributions of block partitioning, and adapts early terminations in the RDO procedure accordingly. Experimental results show that the proposed method provides flexibility for controlling the tradeoff between speed and coding efficiency, and can achieve an average time saving of 36.1% (up to 50.6%) with negligible bitrate cost.

الوسائط المتعددة

CNN-based driving of block partitioning for intra slices encoding

97 - Franck Galpin , Fabien Racape , Sunil Jaiswal 2020

This paper provides a technical overview of a deep-learning-based encoder method aiming at optimizing next generation hybrid video encoders for driving the block partitioning in intra slices. An encoding approach based on Convolutional Neural Network s is explored to partly substitute classical heuristics-based encoder speed-ups by a systematic and automatic process. The solution allows controlling the trade-off between complexity and coding gains, in intra slices, with one single parameter. This algorithm was proposed at the Call for Proposals of the Joint Video Exploration Team (JVET) on video compression with capability beyond HEVC. In All Intra configuration, for a given allowed topology of splits, a speed-up of $times 2$ is obtained without BD-rate loss, or a speed-up above $times 4$ with a loss below 1% in BD-rate.

الوسائط المتعددة

A novel technique for image steganography based on Block-DCT and Huffman Encoding

494 - A.Nag 2010

Image steganography is the art of hiding information into a cover image. This paper presents a novel technique for Image steganography based on Block-DCT, where DCT is used to transform original image (cover image) blocks from spatial domain to frequ ency domain. Firstly a gray level image of size M x N is divided into no joint 8 x 8 blocks and a two dimensional Discrete Cosine Transform (2-d DCT) is performed on each of the P = MN / 64 blocks. Then Huffman encoding is also performed on the secret messages/images before embedding and each bit of Huffman code of secret message/image is embedded in the frequency domain by altering the least significant bit of each of the DCT coefficients of cover image blocks. The experimental results show that the algorithm has a high capacity and a good invisibility. Moreover PSNR of cover image with stego-image shows the better results in comparison with other existing steganography approaches. Furthermore, satisfactory security is maintained since the secret message/image cannot be extracted without knowing decoding rules and Huffman table.

الوسائط المتعددة

Content based video retrieval

380 - B. V. Patel , B. B. Meshram 2012

Content based video retrieval is an approach for facilitating the searching and browsing of large image collections over World Wide Web. In this approach, video analysis is conducted on low level visual properties extracted from video frame. We belie ved that in order to create an effective video retrieval system, visual perception must be taken into account. We conjectured that a technique which employs multiple features for indexing and retrieval would be more effective in the discrimination and search tasks of videos. In order to validate this claim, content based indexing and retrieval systems were implemented using color histogram, various texture features and other approaches. Videos were stored in Oracle 9i Database and a user study measured correctness of response.

الوسائط المتعددة الرؤية الحاسوبية وتمييز الأنماط

Video-based Point Cloud Compression Artifact Removal

171 - Anique Akhtar , Wen Gao , Li Li 2021

Photo-realistic point cloud capture and transmission are the fundamental enablers for immersive visual communication. The coding process of dynamic point clouds, especially video-based point cloud compression (V-PCC) developed by the MPEG standardiza tion group, is now delivering state-of-the-art performance in compression efficiency. V-PCC is based on the projection of the point cloud patches to 2D planes and encoding the sequence as 2D texture and geometry patch sequences. However, the resulting quantization errors from coding can introduce compression artifacts, which can be very unpleasant for the quality of experience (QoE). In this work, we developed a novel out-of-the-loop point cloud geometry artifact removal solution that can significantly improve reconstruction quality without additional bandwidth cost. Our novel framework consists of a point cloud sampling scheme, an artifact removal network, and an aggregation scheme. The point cloud sampling scheme employs a cube-based neighborhood patch extraction to divide the point cloud into patches. The geometry artifact removal network then processes these patches to obtain artifact-removed patches. The artifact-removed patches are then merged together using an aggregation scheme to obtain the final artifact-removed point cloud. We employ 3D deep convolutional feature learning for geometry artifact removal that jointly recovers both the quantization direction and the quantization noise level by exploiting projection and quantization prior. The simulation results demonstrate that the proposed method is highly effective and can considerably improve the quality of the reconstructed point cloud.

الوسائط المتعددة