No Arabic abstract
Protecting the Intellectual Property Rights (IPR) associated to Deep Neural Networks (DNNs) is a pressing need pushed by the high costs required to train such networks and the importance that DNNs are gaining in our society. Following its use for Multimedia (MM) IPR protection, digital watermarking has recently been considered as a mean to protect the IPR of DNNs. While DNN watermarking inherits some basic concepts and methods from MM watermarking, there are significant differences between the two application areas, calling for the adaptation of media watermarking techniques to the DNN scenario and the development of completely new methods. In this paper, we overview the most recent advances in DNN watermarking, by paying attention to cast it into the bulk of watermarking theory developed during the last two decades, while at the same time highlighting the new challenges and opportunities characterizing DNN watermarking. Rather than trying to present a comprehensive description of all the methods proposed so far, we introduce a new taxonomy of DNN watermarking and present a few exemplary methods belonging to each class. We hope that this paper will inspire new research in this exciting area and will help researchers to focus on the most innovative and challenging problems in the field.
DNN watermarking is receiving an increasing attention as a suitable mean to protect the Intellectual Property Rights associated to DNN models. Several methods proposed so far are inspired to the popular Spread Spectrum (SS) paradigm according to which the watermark bits are embedded into the projection of the weights of the DNN model onto a pseudorandom sequence. In this paper, we propose a new DNN watermarking algorithm that leverages on the watermarking with side information paradigm to decrease the obtrusiveness of the watermark and increase its payload. In particular, the new scheme exploits the main ideas of ST-DM (Spread Transform Dither Modulation) watermarking to improve the performance of a recently proposed algorithm based on conventional SS. The experiments we carried out by applying the proposed scheme to watermark different models, demonstrate its capability to provide a higher payload with a lower impact on network accuracy than a baseline method based on conventional SS, while retaining a satisfactory level of robustness.
In order to protect the intellectual property (IP) of deep neural networks (DNNs), many existing DNN watermarking techniques either embed watermarks directly into the DNN parameters or insert backdoor watermarks by fine-tuning the DNN parameters, which, however, cannot resist against various attack methods that remove watermarks by altering DNN parameters. In this paper, we bypass such attacks by introducing a structural watermarking scheme that utilizes channel pruning to embed the watermark into the host DNN architecture instead of crafting the DNN parameters. To be specific, during watermark embedding, we prune the internal channels of the host DNN with the channel pruning rates controlled by the watermark. During watermark extraction, the watermark is retrieved by identifying the channel pruning rates from the architecture of the target DNN model. Due to the superiority of pruning mechanism, the performance of the DNN model on its original task is reserved during watermark embedding. Experimental results have shown that, the proposed work enables the embedded watermark to be reliably recovered and provides a high watermark capacity, without sacrificing the usability of the DNN model. It is also demonstrated that the work is robust against common transforms and attacks designed for conventional watermarking approaches.
Machine learning (ML) models are applied in an increasing variety of domains. The availability of large amounts of data and computational resources encourages the development of ever more complex and valuable models. These models are considered intellectual property of the legitimate parties who have trained them, which makes their protection against stealing, illegitimate redistribution, and unauthorized application an urgent need. Digital watermarking presents a strong mechanism for marking model ownership and, thereby, offers protection against those threats. The emergence of numerous watermarking schemes and attacks against them is pushed forward by both academia and industry, which motivates a comprehensive survey on this field. This document at hand provides the first extensive literature review on ML model watermarking schemes and attacks against them. It offers a taxonomy of existing approaches and systemizes general knowledge around them. Furthermore, it assembles the security requirements to watermarking approaches and evaluates schemes published by the scientific community according to them in order to present systematic shortcomings and vulnerabilities. Thus, it can not only serve as valuable guidance in choosing the appropriate scheme for specific scenarios, but also act as an entry point into developing new mechanisms that overcome presented shortcomings, and thereby contribute in advancing the field.
The state of the art performance of deep learning models comes at a high cost for companies and institutions, due to the tedious data collection and the heavy processing requirements. Recently, [35, 22] proposed to watermark convolutional neural networks for image classification, by embedding information into their weights. While this is a clear progress towards model protection, this technique solely allows for extracting the watermark from a network that one accesses locally and entirely. Instead, we aim at allowing the extraction of the watermark from a neural network (or any other machine learning model) that is operated remotely, and available through a service API. To this end, we propose to mark the models action itself, tweaking slightly its decision frontiers so that a set of specific queries convey the desired information. In the present paper, we formally introduce the problem and propose a novel zero-bit watermarking algorithm that makes use of adversarial model examples. While limiting the loss of performance of the protected model, this algorithm allows subsequent extraction of the watermark using only few queries. We experimented the approach on three neural networks designed for image classification, in the context of MNIST digit recognition task.
The rise of machine learning as a service and model sharing platforms has raised the need of traitor-tracing the models and proof of authorship. Watermarking technique is the main component of existing methods for protecting copyright of models. In this paper, we show that distillation, a widely used transformation technique, is a quite effective attack to remove watermark embedded by existing algorithms. The fragility is due to the fact that distillation does not retain the watermark embedded in the model that is redundant and independent to the main learning task. We design ingrain in response to the destructive distillation. It regularizes a neural network with an ingrainer model, which contains the watermark, and forces the model to also represent the knowledge of the ingrainer. Our extensive evaluations show that ingrain is more robust to distillation attack and its robustness against other widely used transformation techniques is comparable to existing methods.