ﻻ يوجد ملخص باللغة العربية
Recent research on reinforcement learning (RL) has suggested that trained agents are vulnerable to maliciously crafted adversarial samples. In this work, we show how such samples can be generalised from White-box and Grey-box attacks to a strong Black-box case, where the attacker has no knowledge of the agents, their training parameters and their training methods. We use sequence-to-sequence models to predict a single action or a sequence of future actions that a trained agent will make. First, we show our approximation model, based on time-series information from the agent, consistently predicts RL agents future actions with high accuracy in a Black-box setup on a wide range of games and RL algorithms. Second, we find that although adversarial samples are transferable from the target model to our RL agents, they often outperform random Gaussian noise only marginally. This highlights a serious methodological deficiency in previous work on such agents; random jamming should have been taken as the baseline for evaluation. Third, we propose a novel use for adversarial samplesin Black-box attacks of RL agents: they can be used to trigger a trained agent to misbehave after a specific time delay. This appears to be a genuinely new type of attack. It potentially enables an attacker to use devices controlled by RL agents as time bombs.
Advances in computing resources have resulted in the increasing complexity of cyber-physical systems (CPS). As the complexity of CPS evolved, the focus has shifted from traditional control methods to deep reinforcement learning-based (DRL) methods fo
To improve policy robustness of deep reinforcement learning agents, a line of recent works focus on producing disturbances of the environment. Existing approaches of the literature to generate meaningful disturbances of the environment are adversaria
Boundary based blackbox attack has been recognized as practical and effective, given that an attacker only needs to access the final model prediction. However, the query efficiency of it is in general high especially for high dimensional image data.
Inspired by recent work in attention models for image captioning and question answering, we present a soft attention model for the reinforcement learning domain. This model uses a soft, top-down attention mechanism to create a bottleneck in the agent
Collecting training data from untrusted sources exposes machine learning services to poisoning adversaries, who maliciously manipulate training data to degrade the model accuracy. When trained on offline datasets, poisoning adversaries have to inject