On the robustness of self-attentive models

Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh

Research output: Chapter in Book/Conference Proceeding/Report › Conference Paper published in a book › peer-review

77 Citations (Scopus)

Abstract

This work examines the robustness of self-attentive neural networks against adversarial input perturbations. Specifically, we investigate the attention and feature extraction mechanisms of state-of-the-art recurrent neural networks and self-attentive architectures for sentiment analysis, entailment, and machine translation under adversarial attacks. We also propose a novel attack algorithm for generating more natural adversarial examples that could mislead neural models but not humans. Experimental results show that, compared to recurrent neural models, self-attentive models are more robust against adversarial perturbations. In addition, we provide theoretical explanations for their superior robustness to support our claims.
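
For illustration, here is a minimal Python sketch of a greedy word-substitution attack of the general kind the abstract describes, where the model under attack is treated as a black box returning class probabilities. The names predict_proba, get_synonyms, and max_edits are hypothetical stand-ins for this sketch; this is not the paper's actual attack algorithm.

# Hypothetical sketch: greedily swap words for synonyms that most reduce
# the model's confidence in the true label, keeping the perturbed text
# natural enough to mislead models but not humans.
from typing import Callable, List

def greedy_substitution_attack(
    tokens: List[str],
    true_label: int,
    predict_proba: Callable[[List[str]], List[float]],  # assumed black-box model
    get_synonyms: Callable[[str], List[str]],           # assumed synonym source
    max_edits: int = 3,
) -> List[str]:
    adversarial = list(tokens)
    for _ in range(max_edits):
        base = predict_proba(adversarial)[true_label]
        best_drop, best_edit = 0.0, None
        # Try every single-word synonym swap and keep the most damaging one.
        for i, word in enumerate(adversarial):
            for candidate in get_synonyms(word):
                trial = adversarial[:i] + [candidate] + adversarial[i + 1:]
                drop = base - predict_proba(trial)[true_label]
                if drop > best_drop:
                    best_drop, best_edit = drop, (i, candidate)
        if best_edit is None:
            break  # no single swap lowers confidence further
        adversarial[best_edit[0]] = best_edit[1]
    return adversarial

Measuring how quickly such a search degrades a recurrent model versus a self-attentive one is, in spirit, the kind of robustness comparison the paper reports.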

Original language: English
Title of host publication: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 1520-1529
Number of pages: 10
ISBN (Electronic): 9781950737482
Publication status: Published - 2020
Externally published: Yes
Event: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: 28 Jul 2019 – 2 Aug 2019

Publication series

Name: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country/Territory: Italy
City: Florence
Period: 28/07/19 – 02/08/19

Bibliographical note

Publisher Copyright:
© 2019 Association for Computational Linguistics
