Convolutional Video Steganography with Temporal Residual Modeling


Abstract in English

Steganography is the art of unobtrusively concealing a secret message within cover data. This work focuses on visual steganography techniques that hide a full-sized color image or video within another. Most existing work addresses the image case, where both the secret and the cover data are images. We empirically show that an image steganography model does not naturally extend to the video case (i.e., hiding one video inside another), mainly because it completely ignores the temporal redundancy between consecutive video frames. Our work proposes a novel solution to the problem of video steganography. The technical contributions are twofold. First, we observe that the residual between two consecutive frames is close to zero at most pixels, and hiding such highly sparse data is significantly easier than hiding the original frames. Motivated by this fact, we propose to model inter-frame residuals explicitly rather than blindly applying an image steganography model to every video frame. Specifically, our model contains two branches: one is specially designed to hide the inter-frame difference in a cover video frame, and the other hides the original secret frame. A simple thresholding method determines which branch each secret video frame takes. To reveal the concealed secret video, two decoders are devised, one recovering the difference and the other recovering the frame. Second, we build the model on deep convolutional neural networks, which is the first of its kind in the video steganography literature. In experiments, we conduct comprehensive evaluations that compare our model with both the classic least significant bit (LSB) method and pure image steganography models. All results strongly suggest that the proposed model has advantages over previous methods. We also carefully investigate the key factors behind the success of our deep video steganography model.
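
The branch selection described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the thresholding criterion, so the code below assumes one plausible choice: the mean absolute inter-frame residual compared against a hypothetical `threshold` parameter. The names `residual`, `mean_abs`, and `route_frames` are illustrative and not taken from the paper, the frames are small grayscale grids rather than color video, and the convolutional hiding and revealing networks are not modeled at all.

```python
from typing import List, Tuple

# Assumption: a frame is a grayscale grid of pixel values in 0..255.
Frame = List[List[int]]


def residual(prev: Frame, curr: Frame) -> Frame:
    """Pixel-wise difference between two consecutive frames."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def mean_abs(res: Frame) -> float:
    """Mean absolute value of a residual; small when the residual is sparse."""
    total = sum(abs(v) for row in res for v in row)
    count = sum(len(row) for row in res)
    return total / count if count else 0.0


def route_frames(frames: List[Frame], threshold: float) -> List[Tuple[str, Frame]]:
    """Assign each secret frame to the 'residual' or the 'frame' branch.

    Assumed rule (not confirmed by the abstract): a frame goes to the
    residual branch when its mean absolute residual against the previous
    frame is below `threshold`. The first frame has no predecessor, so it
    always goes to the frame branch.
    """
    routed: List[Tuple[str, Frame]] = []
    for i, frame in enumerate(frames):
        if i == 0:
            routed.append(("frame", frame))
            continue
        res = residual(frames[i - 1], frame)
        if mean_abs(res) < threshold:
            routed.append(("residual", res))   # payload is the sparse residual
        else:
            routed.append(("frame", frame))    # payload is the full frame
    return routed


# Toy clip: frame 1 barely changes, frame 2 is a scene cut.
clip = [
    [[10, 10], [10, 10]],
    [[10, 11], [10, 10]],
    [[200, 200], [200, 200]],
]
branches = [name for name, _ in route_frames(clip, threshold=5.0)]
print(branches)
```

In this toy clip the nearly static second frame is routed to the residual branch, while the abrupt third frame falls back to the frame branch. A real system would pass each payload to the corresponding hiding network and would also need a reveal-side rule for reconstructing frames from recovered residuals, which this sketch leaves out.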
