An intriguing property of deep neural networks is that adversarial attacks can transfer across different models. Existing methods such as the Intermediate Level Attack (ILA) further improve black-box transferability by fine-tuning a reference adversarial attack so as to maximize the perturbation on a pre-specified layer of the source model. In this work, we revisit ILA and evaluate the effect of applying augmentation to the images before passing them to ILA. We start by examining common image augmentation techniques and then explore novel augmentations derived from adversarial perturbations. Based on these observations, we propose Aug-ILA, an improved method that enhances the transferability of an existing attack under the ILA framework. Specifically, Aug-ILA has three main characteristics: typical image augmentation, such as random cropping and resizing, applied to all ILA inputs; a reverse adversarial update on the clean image; and interpolation between two attacks on the reference image. Our experimental results show that Aug-ILA outperforms ILA and its subsequent variants, as well as state-of-the-art transfer-based attacks, achieving average attack success rates of 96.99% and 87.84% on nine undefended models with perturbation budgets of 13/255 (0.05) and 8/255 (0.03), respectively. Beyond being a strong transfer-based attack, Aug-ILA can also be adopted for adversarial training. We propose a two-phase training scheme that aims both to speed up training and to achieve better robustness than previous works: after a pre-training phase using an existing framework, we employ Aug-ILA to fine-tune the model. Extensive experiments show that Aug-ILA boosts model robustness by up to 5% while the model still converges in a reasonable time.
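The three input transformations named in the abstract can be pictured with a minimal numpy sketch. This is not the thesis code: the function names, parameters (`crop_frac`, `alpha`, `lam`), and the exact form of the reverse update and interpolation are assumptions based only on the abstract's description.

```python
import numpy as np

def random_crop_resize(img, crop_frac=0.9, rng=None):
    """Typical augmentation: random crop, then nearest-neighbour resize
    back to the original spatial size (hypothetical parameters)."""
    rng = rng or np.random.default_rng(0)
    h, w, _ = img.shape
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    rows = (np.arange(h) * ch / h).astype(int)   # nearest-neighbour
    cols = (np.arange(w) * cw / w).astype(int)   # index maps
    return crop[rows][:, cols]

def aug_ila_inputs(x_clean, x_adv_a, x_adv_b, alpha=0.1, lam=0.5, rng=None):
    """Assumed preprocessing of the ILA inputs, per the abstract:
    1. reverse adversarial update: step the clean image *against* the
       reference perturbation direction (interpretation, not verified);
    2. interpolation between two reference attacks;
    3. the same crop-and-resize augmentation applied to every input."""
    x_rev = np.clip(x_clean - alpha * (x_adv_a - x_clean), 0.0, 1.0)
    x_ref = lam * x_adv_a + (1.0 - lam) * x_adv_b
    return (random_crop_resize(x_rev, rng=rng),
            random_crop_resize(x_ref, rng=rng))
```

The augmented pair would then replace the clean and reference images fed to the usual ILA intermediate-layer objective.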
| Date of Award | 2022 |
|---|---|
| Original language | English |
| Awarding Institution | The Hong Kong University of Science and Technology |
| Supervisor | Dit Yan YEUNG (Supervisor) |
Aug-ILA: more transferable attacks and their application to adversarial training
YAN, C. W. (Author). 2022
Student thesis: Master's thesis