This paper studies bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that \emph{any} bandit algorithm with regret $O(\log T)$ can be forced to suffer a regret of $\Omega(T)$ with an expected amount of contamination $O(\log T)$. This amount of contamination is also necessary, as we prove that there exists an $O(\log T)$ regret bandit algorithm, specifically the classical UCB, that requires $\Omega(\log T)$ contamination to be forced into $\Omega(T)$ regret. To combat such poisoning attacks, our second main contribution is a novel algorithm, Secure-UCB, which uses \emph{verification} to access a limited number of uncontaminated rewards. We show that with an expected number of verifications of $O(\log T)$, Secure-UCB can restore the order-optimal $O(\log T)$ regret \emph{irrespective of the amount of contamination} used by the attacker. Finally, we prove that for any bandit algorithm, $O(\log T)$ verifications are necessary to recover the order-optimal regret. We conclude that Secure-UCB is order-optimal in terms of both the expected regret and the expected number of verifications, and can save stochastic bandits from any data poisoning attack.
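The interaction described in this abstract, an attacker that contaminates each observed reward with additive noise and a learner that can occasionally pay to see an uncontaminated reward, can be made concrete with a short sketch. The Python snippet below is a minimal, hypothetical illustration, not the paper's Secure-UCB: it runs a standard UCB1 index over Bernoulli arms, lets a user-supplied `attack` callback inject the contamination, and uses a toy verification schedule (verify with probability roughly $1/t$) only to show how an $O(\log T)$ expected verification budget arises. The names `secure_ucb_sketch`, `attack`, and `verify_budget_rate` are introduced here purely for illustration.

```python
import math
import random

def secure_ucb_sketch(true_means, horizon, attack=None, verify_budget_rate=1.0):
    """UCB1 loop under additive reward contamination, with occasional verification.

    attack(t, arm, clean_reward) is a hypothetical contamination callback that
    returns an additive perturbation; the verification rule here is a toy
    schedule, not the paper's Secure-UCB rule.
    """
    k = len(true_means)
    counts = [0] * k
    means = [0.0] * k          # empirical means built from the observed rewards
    verifications = 0

    for t in range(1, horizon + 1):
        # Play each arm once, then pick the arm with the highest UCB1 index.
        if t <= k:
            arm = t - 1
        else:
            arm = max(range(k),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))

        clean = 1.0 if random.random() < true_means[arm] else 0.0   # Bernoulli reward
        observed = clean + (attack(t, arm, clean) if attack else 0.0)

        # Toy verification schedule: verify with probability ~ 1/t, so the
        # expected number of verifications over T rounds is O(log T).
        if random.random() < min(1.0, verify_budget_rate / t):
            observed = clean
            verifications += 1

        counts[arm] += 1
        means[arm] += (observed - means[arm]) / counts[arm]

    return means, counts, verifications

# Example: an attacker that pushes the best arm's rewards down by 0.5.
if __name__ == "__main__":
    means, counts, nver = secure_ucb_sketch(
        true_means=[0.9, 0.5],
        horizon=10_000,
        attack=lambda t, arm, r: -0.5 if arm == 0 else 0.0,
    )
    print(counts, nver)
```

This sketch only illustrates the interfaces (contaminated observations plus a logarithmic verification budget); the actual Secure-UCB decision rule for when to verify, and its regret guarantee, are given in the paper itself.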
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker. As a victim, we consider RL agents whose objective is to find a policy t
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human superv
In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the environment reward $r_t$ into $r_t+\delta_t$ at each step, with the goal of forcing the RL agent to learn a nefarious policy. We categorize such attacks by th
In a poisoning attack, an adversary with control over a small fraction of the training data attempts to select that data in a way that induces a corrupted model that misbehaves in favor of the adversary. We consider poisoning attacks against convex m
Certifiers for neural networks have made great progress towards provable robustness guarantees against evasion attacks using adversarial examples. However, introducing certifiers into deep learning systems also opens up new attack vectors, which need