Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Variational image compression with a scale hyperprior

120 0 0.0 ( 0 )

Download Cite

Added by Johannes Ball\\'e

Publication date 2018

fields Electronic Engineering Informatics Engineering

and research's language is English

Authors Johannes Balle - David Minnen - Saurabh Singh

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture spatial dependencies in the latent representation. This hyperprior relates to side information, a concept universal to virtually all modern image codecs, but largely unexplored in image compression using artificial neural networks (ANNs). Unlike existing autoencoder compression methods, our model trains a complex prior jointly with the underlying autoencoder. We demonstrate that this model leads to state-of-the-art image compression when measuring visual quality using the popular MS-SSIM index, and yields rate-distortion performance surpassing published ANN-based methods when evaluated using a more traditional metric based on squared error (PSNR). Furthermore, we provide a qualitative comparison of models trained for different distortion metrics.

rate research

Deep Stereo Image Compression with Decoder Side Information using Wyner Common Information

101 - Nitish Mital , Ezgi Ozyilkan , Ali Garjani 2021

We present a novel deep neural network (DNN) architecture for compressing an image when a correlated image is available as side information only at the decoder. This problem is known as distributed source coding (DSC) in information theory. In particular, we consider a pair of stereo images, which generally have high correlation with each other due to overlapping fields of view, and assume that one image of the pair is to be compressed and transmitted, while the other image is available only at the decoder. In the proposed architecture, the encoder maps the input image to a latent space, quantizes the latent representation, and compresses it using entropy coding. The decoder is trained to extract the Wyners common information between the input image and the correlated image from the latter. The received latent representation and the locally generated common information are passed through a decoder network to obtain an enhanced reconstruction of the input image. The common information provides a succinct representation of the relevant information at the receiver. We train and demonstrate the effectiveness of the proposed approach on the KITTI dataset of stereo image pairs. Our results show that the proposed architecture is capable of exploiting the decoder-only side information, and outperforms previous work on stereo image compression with decoder side information.

Image and Video Processing Information Theory Machine Learning

Sparse Multi-layer Image Approximation: Facial Image Compression

281 - Sohrab Ferdowsi , Svyatoslav Voloshynovskiy , Dimche Kostadinov 2015

We propose a scheme for multi-layer representation of images. The problem is first treated from an information-theoretic viewpoint where we analyze the behavior of different sources of information under a multi-layer data compression framework and compare it with a single-stage (shallow) structure. We then consider the image data as the source of information and link the proposed representation scheme to the problem of multi-layer dictionary learning for visual data. For the current work we focus on the problem of image compression for a special class of images where we report a considerable performance boost in terms of PSNR at high compression ratios in comparison with the JPEG2000 codec.

Computer Vision and Pattern Recognition Information Theory Information Theory

Lossless Image Compression Using a Multi-Scale Progressive Statistical Model

148 - Honglei Zhang , Francesco Cricri , Hamed R. Tavakoli 2021

Lossless image compression is an important technique for image storage and transmission when information loss is not allowed. With the fast development of deep learning techniques, deep neural networks have been used in this field to achieve a higher compression rate. Methods based on pixel-wise autoregressive statistical models have shown good performance. However, the sequential processing way prevents these methods to be used in practice. Recently, multi-scale autoregressive models have been proposed to address this limitation. Multi-scale approaches can use parallel computing systems efficiently and build practical systems. Nevertheless, these approaches sacrifice compression performance in exchange for speed. In this paper, we propose a multi-scale progressive statistical model that takes advantage of the pixel-wise approach and the multi-scale approach. We developed a flexible mechanism where the processing order of the pixels can be adjusted easily. Our proposed method outperforms the state-of-the-art lossless image compression methods on two large benchmark datasets by a significant margin without degrading the inference speed dramatically.

Image and Video Processing Computer Vision and Pattern Recognition Machine Learning

A Compression Objective and a Cycle Loss for Neural Image Compression

131 - Caglar Aytekin , Francesco Cricri , Antti Hallapuro 2019

In this manuscript we propose two objective terms for neural image compression: a compression objective and a cycle loss. These terms are applied on the encoder output of an autoencoder and are used in combination with reconstruction losses. The compression objective encourages sparsity and low entropy in the activations. The cycle loss term represents the distortion between encoder outputs computed from the original image and from the reconstructed image (code-domain distortion). We train different autoencoders by using the compression objective in combination with different losses: a) MSE, b) MSE and MSSSIM, c) MSE, MS-SSIM and cycle loss. We observe that images encoded by these differently-trained autoencoders fall into different points of the perception-distortion curve (while having similar bit-rates). In particular, MSE-only training favors low image-domain distortion, whereas cycle loss training favors high perceptual quality.

Image and Video Processing Machine Learning Machine Learning

Variable Rate Deep Image Compression with Modulated Autoencoder

107 - Fei Yang , Luis Herranz , Joost van de Weijer 2019

Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods are optimized for a single fixed rate-distortion tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades in low bitrates, and also shrinks the effective range of bit rates. Addressing these limitations, we formulate the problem of variable rate-distortion optimization for deep image compression, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific rate-distortion tradeoff via a modulation network. Jointly training this modulated autoencoder and modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance of independent models with significantly fewer parameters.

Image and Video Processing Computer Vision and Pattern Recognition

comments

Fetching comments

Sham Higher Institute of Forensic Sciences and the Arabic language and Islamic studies and research

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Variational image compression with a scale hyperprior

Ask ChatGPT about the research

No Arabic abstract

Read More