Attacking Text Classifiers via Sentence Rewriting Sampler


Abstract in English

Most adversarial attack methods for text classification change the classifier's prediction through synonym substitution. We propose the adversarial sentence rewriting sampler (ASRS), which rewrites the whole sentence to generate adversarial examples that are more similar to the original and of higher quality. Our method achieves a better attack success rate on 4 out of 7 datasets, as well as significantly better sentence quality on all 7 datasets. ASRS is an indispensable supplement to existing attack methods, because classifiers cannot resist attacks from ASRS unless they are trained on adversarial examples found by ASRS.
