ﻻ يوجد ملخص باللغة العربية
This paper presents a novel convolutional neural network (CNN) based image compression framework via scalable auto-encoder (SAE). Specifically, our SAE based deep image codec consists of hierarchical coding layers, each of which is an end-to-end optimized auto-encoder. The coarse image content and texture are encoded through the first (base) layer while the consecutive (enhance) layers iteratively code the pixel-level reconstruction errors between the original and former reconstructed images. The proposed SAE structure alleviates the need to train multiple models for different bit-rate points by recently proposed auto-encoder based codecs. The SAE layers can be combined to realize multiple rate points, or to produce a scalable stream. The proposed method has similar rate-distortion performance in the low-to-medium rate range as the state-of-the-art CNN based image codec (which uses different optimized networks to realize different bit rates) over a standard public image dataset. Furthermore, the proposed codec generates better perceptual quality in this bit rate range.
In this paper, we propose a learned scalable/progressive image compression scheme based on deep neural networks (DNN), named Bidirectional Context Disentanglement Network (BCD-Net). For learning hierarchical representations, we first adopt bit-plane
In this paper, we propose a cross-modal variational auto-encoder (CMVAE) for content-based micro-video background music recommendation. CMVAE is a hierarchical Bayesian generative model that matches relevant background music to a micro-video by proje
Terahertz (THz) sensing is a promising imaging technology for a wide variety of different applications. Extracting the interpretable and physically meaningful parameters for such applications, however, requires solving an inverse problem in which a m
Emerging applications in multiview streaming look for providing interactive navigation services to video players. The user can ask for information from any viewpoint with a minimum transmission delay. The purpose is to provide user with as much infor
The past few years have witnessed increasing interests in applying deep learning to video compression. However, the existing approaches compress a video frame with only a few number of reference frames, which limits their ability to fully exploit the