Recent research has demonstrated that adding imperceptible perturbations to original images can fool deep learning models. However, current adversarial perturbations usually take the form of noise and thus have no practical meaning. Image watermarking is a technique widely used for copyright protection. We can regard an image watermark as a kind of meaningful noise: adding it to an original image neither affects people's understanding of the image content nor arouses suspicion. Therefore, it is interesting to generate adversarial examples using watermarks. In this paper, we propose a novel watermark perturbation for adversarial examples (Adv-watermark), which combines image watermarking techniques with adversarial example algorithms. Adding a meaningful watermark to clean images can attack DNN models. Specifically, we propose a novel optimization algorithm, called Basin Hopping Evolution (BHE), to generate adversarial watermarks in the black-box attack mode. Thanks to BHE, Adv-watermark requires only a few queries to the threat models to complete an attack. A series of experiments conducted on the ImageNet and CASIA-WebFace datasets shows that the proposed method can efficiently generate adversarial examples and outperforms state-of-the-art attack methods. Moreover, Adv-watermark is more robust against image-transformation defense methods.
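To make the black-box setting concrete, the sketch below shows a simplified basin-hopping-style search over watermark placement parameters (position and transparency). It is not the paper's BHE algorithm, which additionally uses an evolutionary population; the query interface `model_confidence(image, label)` and the parameter ranges are assumptions for illustration.

```python
import numpy as np

def blend_watermark(image, watermark, x, y, alpha):
    """Alpha-blend a small watermark patch onto a copy of the image at (x, y)."""
    out = image.astype(np.float64)
    h, w = watermark.shape[:2]
    out[y:y + h, x:x + w] = (1 - alpha) * out[y:y + h, x:x + w] + alpha * watermark
    return np.clip(out, 0, 255)

def basin_hopping_attack(image, watermark, true_label, model_confidence,
                         n_hops=20, n_local=10, seed=0):
    """Black-box search over watermark position and transparency.

    `model_confidence(image, label)` is an assumed query interface returning the
    model's confidence for `label`; lower is better for the attacker. The watermark
    is assumed to be smaller than the image.
    """
    rng = np.random.default_rng(seed)
    H, W = image.shape[:2]
    h, w = watermark.shape[:2]
    best, best_score = None, np.inf
    for _ in range(n_hops):
        # "Hop": restart from a random basin (random position and transparency).
        x = int(rng.integers(0, W - w))
        y = int(rng.integers(0, H - h))
        alpha = float(rng.uniform(0.3, 0.7))
        for _ in range(n_local):
            # Local refinement: small random moves, keep only improvements.
            cand = (int(np.clip(x + rng.integers(-5, 6), 0, W - w)),
                    int(np.clip(y + rng.integers(-5, 6), 0, H - h)),
                    float(np.clip(alpha + rng.normal(0, 0.05), 0.2, 0.8)))
            score = model_confidence(blend_watermark(image, watermark, *cand), true_label)
            if score < best_score:
                best_score, best = score, cand
                x, y, alpha = cand
    return best, best_score
```

Because each candidate costs exactly one query, the total query budget is bounded by `n_hops * n_local`, which is the property the abstract emphasizes.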
Recently, a self-embedding fragile watermark scheme based on reference-bit interleaving and adaptive selection of the embedding mode was proposed. Reference bits are derived from the scrambled MSB bits of a cover image and are then combined with authentication bits to form the watermark bits for LSB embedding. We find that this algorithm embeds the watermark in each block independently, which makes it vulnerable to a collage attack. In addition, because the generation of authentication bits via hash-function operations does not involve secret keys, we analyze this algorithm with a multiple stego-image attack. We find that the cost of obtaining all the permutation relations of the $l\cdot b^2$ watermark bits of each block (i.e., the equivalent permutation keys) is about $(l\cdot b^2)!$ for the embedding mode $(m, l)$, where $m$ MSB layers of the cover image are used to generate the reference bits, $l$ LSB layers are used to embed the watermark, and $b\times b$ is the size of an image block. Simulation and statistical results demonstrate that our analysis is effective.
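The quoted cost $(l\cdot b^2)!$ is a per-block brute-force count over permutations of the embedded bits. The snippet below simply evaluates that count for illustrative parameters (the values of $l$ and $b$ are assumptions, not taken from the paper).

```python
from math import factorial

def equivalent_key_space(l, b):
    """Brute-force cost (l * b^2)! of recovering the per-block permutation
    of the l * b^2 embedded watermark bits, as quoted in the analysis."""
    return factorial(l * b * b)

# Illustrative parameters (not from the paper): l = 2 LSB layers, 8x8 blocks.
cost = equivalent_key_space(2, 8)   # (2 * 64)! = 128!
print(f"(2*8^2)! has about {cost.bit_length()} bits, i.e. roughly 2**{cost.bit_length() - 1}")
```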
The malicious application of deepfakes (i.e., technologies that can generate target faces or face attributes) has posed a huge threat to our society. The fake multimedia content generated by deepfake models can harm the reputation, and even threaten the property, of the person being impersonated. Fortunately, adversarial watermarks can be used to combat deepfake models by leading them to generate distorted images. Existing methods require an individual training process for every facial image to generate an adversarial watermark against a specific deepfake model, which is extremely inefficient. To address this problem, we propose a universal adversarial attack method on deepfake models that generates a Cross-Model Universal Adversarial Watermark (CMUA-Watermark) capable of protecting thousands of facial images from multiple deepfake models. Specifically, we first propose a cross-model universal attack pipeline that attacks multiple deepfake models and combines gradients from these models iteratively. Then we introduce a batch-based method to alleviate the conflicts among adversarial watermarks generated for different facial images. Finally, we design a more reasonable and comprehensive evaluation method for assessing the effectiveness of the adversarial watermark. Experimental results demonstrate that the proposed CMUA-Watermark can effectively distort the fake facial images generated by deepfake models and successfully protect facial images from deepfakes in real scenes.
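The following PyTorch sketch illustrates the core idea of combining gradients from several models into one shared (universal) watermark. It is a simplified PGD-style aggregation under assumed interfaces (each `model` is an image-to-image deepfake generator), not the paper's exact pipeline; the step size `alpha`, budget `eps`, and distortion loss are assumptions.

```python
import torch

def cmua_style_update(models, face_batch, watermark, alpha=1 / 255, eps=8 / 255):
    """One sketch iteration of a cross-model universal watermark update.

    The attacker maximizes the distortion that the shared watermark induces in
    each generator's output, aggregating per-model gradient signs before the step.
    """
    watermark = watermark.clone().detach().requires_grad_(True)
    total_grad = torch.zeros_like(watermark)
    for model in models:
        fake_clean = model(face_batch).detach()       # output without the watermark
        fake_adv = model(face_batch + watermark)      # output with the watermark
        loss = torch.nn.functional.mse_loss(fake_adv, fake_clean)  # distortion to maximize
        grad, = torch.autograd.grad(loss, watermark)
        total_grad += grad.sign()                     # combine gradients across models
    with torch.no_grad():
        watermark = (watermark + alpha * total_grad.sign()).clamp(-eps, eps)
    return watermark.detach()
```

Because the same `watermark` tensor is updated across batches of faces and across models, the result is universal rather than per-image, which is what removes the per-image training cost criticized above.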
Digital watermarking has been widely used to protect the copyright and integrity of multimedia data. Previous studies mainly focus on designing watermarking techniques that are robust to attacks aimed at destroying the embedded watermarks. However, the emerging deep learning based image generation technology raises a new open issue: whether it is possible to generate fake watermarked images for circumvention. In this paper, we make the first attempt to develop digital image watermark fakers using generative adversarial learning. Assuming that a set of paired original and watermarked images produced by the targeted watermarker is available, we use them to train a watermark faker with U-Net as the backbone; its input is an original image and, after a domain-specific preprocessing step, it outputs a fake watermarked image. Our experiments show that the proposed watermark faker can effectively crack digital image watermarkers in both the spatial and frequency domains, suggesting the risk of such forgery attacks.
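A minimal training-step sketch of such a faker is shown below, written in a pix2pix-like style (paired adversarial learning with an L1 reconstruction term). The GAN formulation, the 6-channel conditional discriminator, and the loss weights are assumptions for illustration; the paper's domain-specific preprocessing is omitted.

```python
import torch
import torch.nn as nn

def faker_train_step(unet, discriminator, originals, watermarked,
                     opt_g, opt_d, lambda_rec=100.0):
    """One simplified paired-GAN step for a watermark faker.

    `unet` maps an original image to a fake watermarked image; the paired
    `watermarked` targets are assumed to come from the targeted watermarker.
    `discriminator` is assumed to accept the original and (fake) watermarked
    images concatenated along the channel dimension.
    """
    bce = nn.BCEWithLogitsLoss()

    # Discriminator step: real pairs vs. faked pairs.
    fake = unet(originals).detach()
    d_real = discriminator(torch.cat([originals, watermarked], dim=1))
    d_fake = discriminator(torch.cat([originals, fake], dim=1))
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: fool the discriminator and match the true watermarked image.
    fake = unet(originals)
    d_fake = discriminator(torch.cat([originals, fake], dim=1))
    g_loss = bce(d_fake, torch.ones_like(d_fake)) + lambda_rec * nn.functional.l1_loss(fake, watermarked)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```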
Training deep neural networks from scratch can be computationally expensive and requires a lot of training data. Recent work has explored different watermarking techniques to protect pre-trained deep neural networks from potential copyright infringement. However, these techniques can be vulnerable to watermark removal attacks. In this work, we propose REFIT, a unified watermark removal framework based on fine-tuning, which does not rely on knowledge of the watermarks and is effective against a wide range of watermarking schemes. In particular, we conduct a comprehensive study of a realistic attack scenario in which the adversary has limited training data, which has not been emphasized in prior work on attacks against watermarking schemes. To effectively remove the watermarks without compromising model functionality under this weak threat model, we propose two techniques that are incorporated into our fine-tuning framework: (1) an adaptation of the elastic weight consolidation (EWC) algorithm, originally proposed for mitigating the catastrophic forgetting phenomenon; and (2) unlabeled data augmentation (AU), where we leverage auxiliary unlabeled data from other sources. Our extensive evaluation shows the effectiveness of REFIT against diverse watermark embedding schemes. In particular, both EWC and AU significantly decrease the amount of labeled training data needed for effective watermark removal, and the unlabeled data samples used for AU need not be drawn from the same distribution as the benign data used for model evaluation. The experimental results demonstrate that our fine-tuning based watermark removal attacks could pose real threats to the copyright of pre-trained models, and thus highlight the importance of further investigating the watermarking problem and proposing watermark embedding schemes that are more robust against such attacks.
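As a concrete reading of the EWC component, the sketch below shows the standard EWC-regularized fine-tuning objective: a quadratic penalty, weighted by Fisher-information estimates, that keeps parameters important for the original task close to their pre-trained values while fine-tuning erodes the watermark behaviour. How `fisher` is estimated and how `lam` is chosen are assumptions here, not details from the paper.

```python
import torch

def ewc_finetune_loss(model, anchor_params, fisher, task_loss, lam=1.0):
    """EWC-style fine-tuning objective (sketch of the idea used in REFIT).

    `anchor_params` and `fisher` are dicts mapping parameter names to snapshots
    of the watermarked pre-trained weights and to Fisher-information estimates.
    """
    penalty = torch.zeros((), device=task_loss.device)
    for name, param in model.named_parameters():
        if name in fisher:
            penalty = penalty + (fisher[name] * (param - anchor_params[name]) ** 2).sum()
    return task_loss + lam * penalty
```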
The embedder and the detector (or decoder) are the two most important components of a digital watermarking system. In this work, I therefore discuss how to design better embedders and detectors (or decoders). I first summarize the prospective applications of watermarking technology and the major watermarking schemes in the literature. My review of the literature centers on how side information is exploited at both the embedder and the detector. In Chapter 3, I derive the optimum detector or decoder for a particular probability distribution of the host signals. I find that the performance of both multiplicative and additive spread spectrum schemes depends on the shape parameter of the host signals. For spread spectrum schemes, the performance of the detector or decoder is degraded by host interference. I therefore present a new host-interference rejection technique for multiplicative spread spectrum schemes, whose embedding rule is tailored to the optimum detection or decoding rule. Although host-interference rejection schemes enjoy a large performance gain over traditional spread spectrum schemes, they are difficult to implement together with perceptual analysis to achieve the maximum allowable embedding level, which discourages their use in real scenarios. Thus, in the final chapters of this work, I introduce a double-sided technique to tackle this drawback. It differs from host-interference rejection schemes in that it utilizes, but does not reject, the host interference at the embedder. Perceptual analysis can easily be incorporated into our scheme to achieve the maximum allowable embedding strength.
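For readers unfamiliar with the baseline being improved upon, the sketch below shows textbook additive and multiplicative spread spectrum embedding together with a plain linear-correlation detector. It is only the conventional starting point: the thesis's optimum detector depends on the shape parameter of the host distribution, and the host-interference rejection and double-sided embedders go beyond this. The Laplacian host, embedding strength, and threshold are illustrative assumptions.

```python
import numpy as np

def ss_embed(host, w, gamma=0.05, multiplicative=False):
    """Baseline spread spectrum embedding of a +/-1 watermark sequence `w`
    into host coefficients (e.g., mid-frequency transform coefficients)."""
    if multiplicative:
        return host * (1.0 + gamma * w)   # multiplicative SS: strength scales with the host
    return host + gamma * w               # additive SS: fixed embedding strength

def linear_correlation_detect(received, w, threshold):
    """Plain linear-correlation detector; the host acts as interference here,
    which is exactly what host-interference rejection schemes try to avoid."""
    stat = float(np.mean(received * w))
    return stat, stat > threshold

# Toy usage with a Laplacian-like host (illustrative only).
rng = np.random.default_rng(0)
host = rng.laplace(0.0, 10.0, size=4096)
w = rng.choice([-1.0, 1.0], size=host.size)
marked = ss_embed(host, w, gamma=0.5)
print(linear_correlation_detect(marked, w, threshold=0.25))
```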