
Graph-based representation for multiview image coding

Published by Thomas Maugey
Publication date: 2013
Research field: Informatics Engineering
Paper language: English





In this paper, we propose a new representation for multiview image sets. Our approach relies on graphs to describe geometry information in a compact and controllable way. The links of the graph connect pixels in different images and describe the proximity between pixels in 3D space. These connections depend on the geometry of the scene and provide exactly the amount of information that is necessary for coding and reconstructing multiple views. This multiview image representation is very compact and adapts the transmitted geometry information to the complexity of the prediction performed at the decoder side. To achieve this, our graph-based representation (GBR) adapts the accuracy of the geometry description, in contrast with depth coding, which applies lossy compression directly to the original geometry signal. We present the principles of GBR and build a complete prototype coding scheme for multiview images. Experimental results demonstrate the potential of this new representation compared to a depth-based approach: GBR can achieve a gain of 2 dB in reconstructed quality over depth-based schemes operating at similar rates.
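As a rough illustration of the idea of geometry-driven pixel connections, the following Python sketch links each pixel of a reference view to its estimated position in a neighbouring rectified view using a per-pixel disparity map. The function and variable names are ours, and the sketch ignores occlusions and the actual coding stage; it is not the GBR codec described in the paper.

import numpy as np

def build_interview_graph(disparity, width, height):
    """Connect each pixel of a reference view to its estimated position
    in a neighbouring view, using integer disparities along rows.

    Returns a list of (src_index, dst_index) edges, where indices are
    row-major pixel positions. Illustrative sketch of geometry-driven
    pixel connections only, not the paper's GBR coding scheme."""
    edges = []
    for y in range(height):
        for x in range(width):
            d = int(disparity[y, x])
            x_dst = x - d                      # horizontal shift between rectified views
            if 0 <= x_dst < width:             # keep only connections that stay in frame
                src = y * width + x
                dst = y * width + x_dst
                edges.append((src, dst))
    return edges

# toy example: 4x4 views with a constant disparity of 1 pixel
h, w = 4, 4
disp = np.ones((h, w), dtype=np.int32)
print(len(build_interview_graph(disp, w, h)), "edges")   # 12 edges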




Read also

Emerging applications in multiview streaming aim to provide interactive navigation services to video players. The user can ask for information from any viewpoint with a minimum transmission delay. The purpose is to provide the user with as much information as possible with the least number of redundancies. The recent concept of navigation segment representation consists of regrouping a given number of viewpoints in one signal and transmitting them to users according to their navigation path. The question of the best description strategy for these navigation segments is, however, still open. In this paper, we propose to represent and code navigation segments by a method that extends the recent layered depth image (LDI) format. It consists of describing the scene from a viewpoint with multiple images organized in layers corresponding to the different levels of occluded objects. The notion of extended LDI comes from the fact that the size of this image is adapted to also take into account the sides of the scene, in contrast to classical LDI. The obtained results show a significant rate-distortion gain compared to classical multiview compression approaches in a navigation scenario.
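The following Python sketch shows a toy layered-depth-image container in which each pixel stores an ordered list of (depth, colour) samples, with layer k holding the k-th occluded surface. The extended LDI of the paper additionally enlarges the image support so that content visible only from side viewpoints can be stored; that adaptation is omitted here, and all class and method names are illustrative assumptions.

class LayeredDepthImage:
    """Toy layered-depth-image container: each pixel holds a list of
    (depth, colour) samples ordered front to back."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # one sample list per pixel
        self.samples = [[[] for _ in range(width)] for _ in range(height)]

    def insert(self, x, y, depth, colour):
        """Add a (depth, colour) sample; keep the list sorted by depth so
        layer k holds the k-th occluded surface at this pixel."""
        pixel = self.samples[y][x]
        pixel.append((depth, colour))
        pixel.sort(key=lambda s: s[0])

    def layer(self, k):
        """Return layer k as a 2-D list of colours (None where the pixel
        has fewer than k+1 samples)."""
        return [[p[k][1] if len(p) > k else None for p in row]
                for row in self.samples]

ldi = LayeredDepthImage(2, 2)
ldi.insert(0, 0, 1.5, 200)   # background sample
ldi.insert(0, 0, 0.8, 90)    # foreground sample that occludes it
print(ldi.layer(0))          # [[90, None], [None, None]]
print(ldi.layer(1))          # [[200, None], [None, None]]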
Enabling users to interactively navigate through different viewpoints of a static scene is an interesting new functionality in 3D streaming systems. While it opens exciting perspectives towards rich multimedia applications, it requires the design of novel representations and coding techniques in order to solve the new challenges imposed by interactive navigation. Interactivity clearly brings new design constraints: the encoder is unaware of the exact decoding process, while the decoder has to reconstruct information from incomplete subsets of data, since the server can generally not transmit images for all possible viewpoints due to resource constraints. In this paper, we propose a novel multiview data representation that makes it possible to satisfy bandwidth and storage constraints in an interactive multiview streaming system. In particular, we partition the multiview navigation domain into segments, each of which is described by a reference image and some auxiliary information. The auxiliary information enables the client to recreate any viewpoint in the navigation segment via view synthesis. The decoder is then able to navigate freely in the segment without further data requests to the server; it requests additional data only when it moves to a different segment. We discuss the benefits of this novel representation in interactive navigation systems and further propose a method to optimize the partitioning of the navigation domain into independent segments under bandwidth and storage constraints. Experimental results confirm the potential of the proposed representation; namely, our system leads to similar compression performance as classical inter-view coding, while it provides the high level of flexibility that is required for interactive streaming. Hence, our new framework represents a promising solution for 3D data representation in novel interactive multimedia services.
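To make the segment idea concrete, here is a small Python sketch that partitions a one-dimensional navigation domain into contiguous segments and counts how many server requests a navigation path triggers (one per segment switch). The fixed segment size stands in for the paper's optimized partitioning under bandwidth and storage constraints; all names and the 1-D setting are illustrative simplifications.

def partition_viewpoints(num_views, segment_size):
    """Split a 1-D navigation domain of viewpoints into contiguous segments.
    Each segment would be served as one reference image plus auxiliary data."""
    segments = []
    for start in range(0, num_views, segment_size):
        views = list(range(start, min(start + segment_size, num_views)))
        segments.append({"reference": views[len(views) // 2],  # central view as reference
                         "views": views})
    return segments

def requests_for_path(path, segments):
    """Count server requests along a navigation path: a new request is
    needed only when the user crosses into a different segment."""
    seg_of = {v: i for i, s in enumerate(segments) for v in s["views"]}
    requests, current = 0, None
    for v in path:
        if seg_of[v] != current:
            requests += 1
            current = seg_of[v]
    return requests

segs = partition_viewpoints(12, 4)                    # segments [0-3], [4-7], [8-11]
print(requests_for_path([0, 1, 2, 3, 4, 5, 4, 3], segs))   # 3 requests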
We consider an interactive multiview video streaming (IMVS) system where clients select their preferred viewpoint in a given navigation window. To provide high-quality IMVS, many high-quality views should be transmitted to the clients. However, this is not always possible due to the limited and heterogeneous capabilities of the clients. In this paper, we propose a novel adaptive IMVS solution based on a layered multiview representation, where camera views are organized into layered subsets to match the different client constraints. We formulate an optimization problem for the joint selection of the view subsets and their encoding rates. We then propose an optimal algorithm and a reduced-complexity greedy algorithm, both based on dynamic programming. Simulation results show the good performance of our novel algorithms compared to a baseline algorithm, showing that an effective adaptive IMVS solution should consider the scene content as well as the clients' capabilities and navigation preferences.
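A minimal Python sketch of the selection idea is given below: candidate (view, rate, quality-gain) options are picked greedily by gain per bit under a total rate budget. This is only a stand-in for the dynamic-programming-based algorithms of the paper, which jointly optimize the view subsets and their encoding rates; the numbers and names are invented for illustration.

def greedy_layer_selection(candidates, rate_budget):
    """Greedily pick which views (and at which encoding rate) to include
    under a total rate budget. Each candidate is (view_id, rate, quality_gain);
    at most one rate per view is kept, chosen by gain per bit."""
    chosen, used, selected_views = [], 0.0, set()
    # consider candidates with the best gain-per-bit first
    for view, rate, gain in sorted(candidates, key=lambda c: c[2] / c[1], reverse=True):
        if view in selected_views or used + rate > rate_budget:
            continue
        chosen.append((view, rate))
        selected_views.add(view)
        used += rate
    return chosen, used

cands = [(0, 2.0, 10.0), (1, 1.0, 6.0), (1, 2.5, 9.0), (2, 1.5, 5.0)]
print(greedy_layer_selection(cands, rate_budget=4.0))   # ([(1, 1.0), (0, 2.0)], 3.0)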
Bo Zhang, Di Xiao, Lan Wang (2021)
In recent years, compressed sensing (CS) based image coding has become a hot topic in the image processing field. However, since the bit depth required for encoding each CS sample is too large, the compression performance of this paradigm is unattractive. To address this issue, a novel CS-based image coding system using a gray transformation is proposed. In the proposed system, we first use a gray transformation to preprocess the original image and then use CS to sample the transformed image. Since the gray transformation makes the probability distribution of the CS samples centralized, the bit depth required for encoding each CS sample is reduced significantly. Consequently, the proposed system can considerably improve the compression performance of CS-based image coding. Simulation results show that the proposed system outperforms the traditional one without gray transformation in terms of compression performance.
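The sketch below illustrates the pipeline under a simplifying assumption: the gray transformation is modelled as a plain mid-gray subtraction that centres pixel values around zero before random CS measurements are taken, so that the measurement distribution concentrates around zero. The actual transformation used in the paper may well differ; the function name and parameters here are illustrative only.

import numpy as np

rng = np.random.default_rng(0)

def cs_sample_with_preprocessing(image, num_measurements):
    """Compressed-sensing sampling of a preprocessed image: an illustrative
    'gray transformation' (mid-gray subtraction, an assumption) followed by
    random Gaussian projections of the flattened image."""
    x = image.astype(np.float64).ravel() - 128.0      # illustrative preprocessing step
    phi = rng.standard_normal((num_measurements, x.size)) / np.sqrt(num_measurements)
    y = phi @ x                                        # CS measurements of the transformed image
    return y, phi

img = rng.integers(0, 256, size=(8, 8))
y, phi = cs_sample_with_preprocessing(img, num_measurements=16)
print(y.shape, float(np.abs(y).mean()))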
Digital array orthogonal transformations that can be presented as a decomposition over basis items, or basis images, are considered. The orthogonal transform provides digital data scattering, a process of pixel energy redistribution, which is illustrated with the help of basis images. Data scattering plays an important role in applications such as image coding and watermarking. We establish simple quantum analogues of basis images; they are representations of quantum operators that describe the transition of a single particle between its states. Considering basis images as items of a matrix, we introduce a block matrix that is suitable for orthogonal transforms of multi-dimensional arrays, such as a block vector whose components are matrices. We present an orthogonal transform that produces correlation between arrays. Due to this correlation, a new feature of data scattering was found. The presented detection algorithm is an example of how this can be used in frequency-domain watermarking.
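As a worked example of the basis-image view of an orthogonal transform, the Python sketch below decomposes a block with an orthonormal DCT matrix and reconstructs it as a sum of coefficient-weighted basis images B_ij = t_i t_j^T. It only illustrates the classical decomposition mentioned in the abstract, not the quantum analogues, the block-matrix construction, or the watermarking detection algorithm.

import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; any orthogonal matrix would serve here."""
    k = np.arange(n)
    t = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    t[0, :] *= 1 / np.sqrt(2)
    return t * np.sqrt(2 / n)

def basis_image_decomposition(block):
    """Express a block as a sum of coefficient-weighted basis images,
    illustrating how an orthogonal transform scatters the pixel energy
    of the block over the basis images."""
    n = block.shape[0]
    T = dct_matrix(n)
    coeffs = T @ block @ T.T                 # forward separable transform
    recon = np.zeros_like(block, dtype=float)
    for i in range(n):
        for j in range(n):
            recon += coeffs[i, j] * np.outer(T[i], T[j])   # add one basis image
    return coeffs, recon

x = np.arange(16, dtype=float).reshape(4, 4)
c, r = basis_image_decomposition(x)
print(np.allclose(r, x))   # True: the basis images reproduce the block exactly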
