BLACK-BOX ADVERSARIAL ATTACK WITH TRANSFERABLE MODEL-BASED EMBEDDING

Zhichao Huang, Tong Zhang

Research output: Contribution to conferenceConference Paperpeer-review

Abstract

We present a new method for black-box adversarial attack. Unlike previous methods that combined transfer-based and scored-based methods by using the gradient or initialization of a surrogate white-box model, this new method tries to learn a low-dimensional embedding using a pretrained model, and then performs efficient search within the embedding space to attack an unknown target network. The method produces adversarial perturbations with high level semantic patterns that are easily transferable. We show that this approach can greatly improve the query efficiency of black-box adversarial attack across different target network architectures. We evaluate our approach on MNIST, ImageNet and Google Cloud Vision API, resulting in a significant reduction on the number of queries. We also attack adversarially defended networks on CIFAR10 and ImageNet, where our method not only reduces the number of queries, but also improves the attack success rate.

Original languageEnglish
Publication statusPublished - 2020
Event8th International Conference on Learning Representations, ICLR 2020 - Addis Ababa, Ethiopia
Duration: 30 Apr 2020 → …

Conference

Conference8th International Conference on Learning Representations, ICLR 2020
Country/TerritoryEthiopia
CityAddis Ababa
Period30/04/20 → …

Bibliographical note

Publisher Copyright:
© 2020 8th International Conference on Learning Representations, ICLR 2020. All rights reserved.

Fingerprint

Dive into the research topics of 'BLACK-BOX ADVERSARIAL ATTACK WITH TRANSFERABLE MODEL-BASED EMBEDDING'. Together they form a unique fingerprint.

Cite this