ﻻ يوجد ملخص باللغة العربية
Black-box adversarial attacks on video recognition models have been explored. Considering the temporal interactions between frames, a few methods try to select some key frames, and then perform attacks on them. Unfortunately, their selecting strategy is independent with the attacking step, resulting in the limited performance. Instead, we argue the frame selection phase is closely relevant with the attacking phase. The key frames should be adjusted according to the attacking results. For that, we formulate the black-box video attacks into Reinforcement Learning (RL) framework. Specifically, the environment in RL is set as the threat model, and the agent in RL plays the role of frame selecting. By continuously querying the threat models and receiving the attacking feedback, the agent gradually adjusts its frame selection strategy and adversarial perturbations become smaller and smaller. A series of experiments demonstrate that our method can significantly reduce the adversarial perturbations with efficient query times.
Nowadays, digital facial content manipulation has become ubiquitous and realistic with the success of generative adversarial networks (GANs), making face recognition (FR) systems suffer from unprecedented security concerns. In this paper, we investig
Deep models have shown their vulnerability when processing adversarial samples. As for the black-box attack, without access to the architecture and weights of the attacked model, training a substitute model for adversarial attacks has attracted wide
We study the problem of attacking video recognition models in the black-box setting, where the model information is unknown and the adversary can only make queries to detect the predicted top-1 class and its probability. Compared with the black-box a
Previous studies have verified that the functionality of black-box models can be stolen with full probability outputs. However, under the more practical hard-label setting, we observe that existing methods suffer from catastrophic performance degrada
Video summarization aims at generating concise video summaries from the lengthy videos, to achieve better user watching experience. Due to the subjectivity, purely supervised methods for video summarization may bring the inherent errors from the anno