Low-complexity 8-point DCT Approximation Based on Angle Similarity for Image and Video Coding

Published by Renato J Cintra
Publication date: 2018
Language: English





Principal component analysis (PCA) is widely used for data decorrelation and dimensionality reduction. However, the use of PCA may be impractical in real-time applications, or in situations where energy and computing constraints are severe. In this context, the discrete cosine transform (DCT) becomes a low-cost alternative for data decorrelation. This paper presents a method to derive computationally efficient approximations to the DCT. The proposed method aims at minimizing the angle between the rows of the exact DCT matrix and the rows of the approximate transformation matrix. The resulting transformation matrices are orthogonal and have extremely low arithmetic complexity. Considering popular performance measures, one of the proposed transformation matrices outperforms the best competitors in both matrix error and coding capabilities. Practical applications in image and video coding demonstrate the relevance of the proposed transformation. In fact, we show that the proposed approximate DCT can outperform the exact DCT for image encoding under certain compression ratios. The proposed transform and its direct competitors are also physically realized as digital prototype circuits using FPGA technology.
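The angle criterion described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the candidate alphabet `{-1, 0, 1}`, the brute-force search, and the helper names `dct_matrix`, `angle`, and `closest_low_complexity_row` are assumptions made for illustration, and the sketch picks each row independently, ignoring the orthogonality requirement that the paper imposes on the full matrix.

```python
import math
from itertools import product

N = 8  # transform length (8-point DCT, as in the paper's title)


def dct_matrix(n=N):
    """Exact orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def angle(u, v):
    """Angle in radians between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_uv = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_uv)


def closest_low_complexity_row(row, alphabet=(-1, 0, 1)):
    """Brute-force search for the candidate row with the smallest angle.

    The alphabet is an assumption: entries of this kind need no
    multiplications, only additions and subtractions.
    """
    best, best_angle = None, math.inf
    for cand in product(alphabet, repeat=len(row)):
        if not any(cand):  # skip the all-zero vector (angle undefined)
            continue
        a = angle(row, cand)
        if a < best_angle:
            best, best_angle = cand, a
    return best, best_angle


C = dct_matrix()
approx = [closest_low_complexity_row(r) for r in C]

for k, (cand, a) in enumerate(approx):
    print(f"row {k}: {cand}  angle = {math.degrees(a):.2f} deg")
```

Because the rows are chosen one at a time, the resulting matrix is not guaranteed to be orthogonal; a search that meets the paper's stated goal would have to restrict the candidates jointly so that the selected rows remain mutually orthogonal.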




Read also

Discrete transforms play an important role in many signal processing applications, and low-complexity alternatives for classical transforms have become popular in recent years. In particular, the discrete cosine transform (DCT) has proven to be convenient for data compression, being employed in well-known image and video coding standards such as JPEG, H.264, and the recent high efficiency video coding (HEVC). In this paper, we introduce a new class of low-complexity 8-point DCT approximations based on a series of works published by Bouguezel, Ahmed and Swamy. Also, a multiparametric fast algorithm that encompasses both known and novel transforms is derived. We select the best-performing DCT approximations after solving a multicriteria optimization problem, and submit them to a scaling method for obtaining larger size transforms. We assess these DCT approximations in both JPEG-like image compression and video coding experiments. We show that the optimal DCT approximations present compelling results in terms of coding efficiency and image quality metrics, and require only a few addition or bit-shifting operations, being suitable for low-complexity and low-power systems.
Qi Yang, Zhan Ma, Yiling Xu (2020)
We propose GraphSIM, an objective metric to accurately predict the subjective quality of point clouds with superimposed geometry and color impairments. Motivated by the facts that the human visual system is more sensitive to high spatial-frequency components (e.g., contours, edges) and weighs local structural variations more than individual point intensity, we first extract geometric keypoints by resampling the reference point cloud geometry information to form the object skeleton. We then construct local graphs centered at these keypoints for both reference and distorted point clouds, followed by collectively aggregating color gradient moments (e.g., zeroth, first, and second) that are derived between all other points and the centered keypoint in the same local graph for significant feature similarity (a.k.a. local significance) measurement. The final similarity index is obtained by pooling the local graph significance across all color channels and by averaging across all graphs. GraphSIM is validated using two large and independent point cloud assessment datasets that involve a wide range of impairments (e.g., re-sampling, compression, additive noise), reliably demonstrating state-of-the-art performance for all distortions with noticeable gains in predicting the subjective mean opinion score (MOS), compared with the point-wise distance-based metrics adopted in standardization reference software. Ablation studies, which examine its key modules and parameters, further show that GraphSIM generalizes to various scenarios with consistent performance.
Versatile Video Coding (VVC) is the most recent international video coding standard jointly developed by ITU-T and ISO/IEC, which was finalized in July 2020. VVC allows for significant bit-rate reductions of around 50% for the same subjective video quality compared to its predecessor, High Efficiency Video Coding (HEVC). One year after finalization, VVC support in devices and chipsets is still under development, which is in line with the typical development cycles of new video coding standards. This paper presents open-source software packages that allow building a complete VVC end-to-end toolchain just one year after its finalization. This includes the Fraunhofer HHI VVenC library for fast and efficient VVC encoding as well as HHI's VVdeC library for live decoding. An experimental integration of VVC in the GPAC software tools and the FFmpeg media framework allows packaging VVC bitstreams, e.g. encoded with VVenC, in the MP4 file format and using DASH for content creation and streaming. The integration of VVdeC allows playback on the receiver. Given these packages, step-by-step tutorials are provided for two possible application scenarios: VVC file encoding plus playback, and adaptive streaming with DASH.
Zhengfang Duanmu (2019)
Rate-distortion (RD) theory is at the heart of lossy data compression. Here we aim to model the generalized RD (GRD) trade-off between the visual quality of a compressed video and its encoding profiles (e.g., bitrate and spatial resolution). We first define the theoretical functional space $\mathcal{W}$ of the GRD function by analyzing its mathematical properties. We show that $\mathcal{W}$ is a convex set in a Hilbert space, inspiring a computational model of the GRD function and a method of estimating model parameters from sparse measurements. To demonstrate the feasibility of our idea, we collect a large-scale database of real-world GRD functions, which turn out to live in a low-dimensional subspace of $\mathcal{W}$. Combining the GRD reconstruction framework and the learned low-dimensional space, we create a low-parameter eigen GRD method to accurately estimate the GRD function of a source video content from only a few queries. Experimental results on the database show that the learned GRD method significantly outperforms state-of-the-art empirical RD estimation methods in both accuracy and efficiency. Last, we demonstrate the promise of the proposed model in video codec comparison.
Jian Yue, Yanbo Gao, Shuai Li (2021)
In-loop filtering is used in video coding to process the reconstructed frame in order to remove blocking artifacts. With the development of convolutional neural networks (CNNs), CNNs have been explored for in-loop filtering, since it can be treated as an image denoising task. However, in addition to being a distorted image, the reconstructed frame is also obtained by a fixed pipeline of block-based encoding operations in video coding. It therefore carries coding-unit based coding distortion with some similar characteristics. In this paper, we address the filtering problem from two aspects: global appearance restoration for disrupted texture, and restoration of the local coding distortion caused by the fixed coding pipeline. Accordingly, a three-stream fusion network based on global appearance and local coding distortion is developed, with a high-level global feature stream, a high-level local feature stream and a low-level local feature stream. An ablation study is conducted to validate the necessity of the different features, demonstrating that the global and local features complement each other in filtering and achieve better performance when combined. To the best of our knowledge, we are the first to clearly characterize the video filtering process from the above global appearance and local coding distortion restoration aspects with experimental verification, providing a clear pathway to developing filter techniques. Experimental results demonstrate that the proposed method significantly outperforms the existing single-frame based methods and achieves 13.5%, 11.3%, and 11.7% BD-Rate savings on average for the AI, LDP and RA configurations, respectively, compared with the HEVC reference software.
