Hidden Backdoor Attack against Semantic Segmentation Models


Abstract in English

Deep neural networks (DNNs) are vulnerable to the emph{backdoor attack}, which intends to embed hidden backdoors in DNNs by poisoning training data. The attacked model behaves normally on benign samples, whereas its prediction will be changed to a particular target label if hidden backdoors are activated. So far, backdoor research has mostly been conducted towards classification tasks. In this paper, we reveal that this threat could also happen in semantic segmentation, which may further endanger many mission-critical applications ($e.g.$, autonomous driving). Except for extending the existing attack paradigm to maliciously manipulate the segmentation models from the image-level, we propose a novel attack paradigm, the emph{fine-grained attack}, where we treat the target label ($i.e.$, annotation) from the object-level instead of the image-level to achieve more sophisticated manipulation. In the annotation of poisoned samples generated by the fine-grained attack, only pixels of specific objects will be labeled with the attacker-specified target class while others are still with their ground-truth ones. Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data. Our method not only provides a new perspective for designing novel attacks but also serves as a strong baseline for improving the robustness of semantic segmentation methods.

Download