The Papers of Adversarial Examples

A complete list of adversarial example papers is given here, and the raw data is available here.

A Complete List of All (arXiv) Adversarial Example Papers

Adversarial Examples in Computer Vision

Newest paper

Adversarial Attack

[1] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow and Rob Fergus. Intriguing properties of neural networks. ICLR 2014.

Gradient-Based Attack

[1] Ian J. Goodfellow, Jonathon Shlens and Christian Szegedy. Explaining and Harnessing Adversarial Examples. ICLR 2015.

[2] Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, and Alan L Yuille. Improving transferability of adversarial examples with input diversity. CVPR 2019.

[3] Lei Wu, Zhanxing Zhu, Cheng Tai and Weinan E. Enhancing the Transferability of Adversarial Examples with Noise Reduced Gradient. ICLR 2018 rejected.

[4] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu and Jianguo Li. Boosting Adversarial Attacks with Momentum. CVPR 2018.

[5] Lianli Gao, Qilong Zhang, Jingkuan Song, Xianglong Liu and Heng Tao Shen. Patch-wise Attack for Fooling Deep Neural Network. ECCV 2020.
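The starting point of this line of work is FGSM ([1] above), which takes a single step in the direction of the sign of the input gradient. A minimal PyTorch sketch, assuming `model` is a classifier over image batches with pixels in [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM (Goodfellow et al., 2015): x_adv = x + eps * sign(grad_x L)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    # Step in the direction that increases the loss, then clip to the valid range.
    x_adv = x + eps * grad.sign()
    return torch.clamp(x_adv, 0.0, 1.0).detach()
```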

GAN-Based Attack

[1] Hyrum S. Anderson, Jonathan Woodbridge and Bobby Filar. DeepDGA: Adversarially-Tuned Domain Generation and Detection. AISec 2016.

[2] Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu and Dawn Song. Generating Adversarial Examples with Adversarial Networks. IJCAI 2018.

[3] Yang Song, Rui Shu, Nate Kushman and Stefano Ermon. Constructing Unrestricted Adversarial Examples with Generative Models. NeurIPS 2018.

[4] Xiaosen Wang, Kun He and John E. Hopcroft. AT-GAN: A Generative Attack Model for Adversarial Transferring on Generative Adversarial Nets. arXiv Preprint arXiv:1904.07793 2019.

[5] Tao Bai, Jun Zhao, Jinlin Zhu, Shoudong Han, Jiefeng Chen and Bo Li. AI-GAN: Attack-Inspired Generation of Adversarial Examples. arXiv Preprint arXiv:2002.02196 2020.
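To illustrate the setup used by, e.g., [2] above: a generator is trained to emit a bounded perturbation that both fools the target classifier and looks realistic to a discriminator. A simplified sketch of one such generator objective (not any paper's exact loss; `G`, `D`, and `target_model` are assumed `nn.Module`s defined elsewhere):

```python
import torch
import torch.nn.functional as F

def generator_loss(G, D, target_model, x, y, eps=0.3, alpha=1.0, beta=1.0):
    # G maps an image batch to a perturbation; D outputs a realism logit;
    # target_model is the classifier under attack. All three are assumptions.
    delta = torch.clamp(G(x), -eps, eps)           # bounded perturbation
    x_adv = torch.clamp(x + delta, 0.0, 1.0)       # keep a valid image
    d_logits = D(x_adv)
    # GAN term: make the discriminator believe x_adv is real.
    loss_gan = F.binary_cross_entropy_with_logits(
        d_logits, torch.ones_like(d_logits))
    # Attack term (untargeted): raise the classifier's loss on the true label.
    loss_adv = -F.cross_entropy(target_model(x_adv), y)
    # Regularizer: keep the perturbation small on average.
    loss_norm = delta.flatten(1).norm(dim=1).mean()
    return loss_adv + alpha * loss_gan + beta * loss_norm
```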

Transferability

Full list

[1] Jiadong Lin, Chuanbiao Song, Kun He, Liwei Wang, John E. Hopcroft. Nesterov Accelerated Gradient and Scale Invariance for Improving Transferability of Adversarial Examples. ICLR 2020.

[2] Weibin Wu, Yuxin Su, Xixian Chen, Shenglin Zhao, Irwin King, Michael R. Lyu, Yu-Wing Tai. Boosting the Transferability of Adversarial Samples via Attention. CVPR 2020.

[3] Nathan Inkawhich, Kevin Liang, Lawrence Carin and Yiran Chen. Transferable Perturbations of Deep Feature Distributions. ICLR 2020.

[4] Kaizhao Liang, Jacky Y. Zhang, Oluwasanmi Koyejo, Bo Li. Does Adversarial Transferability Indicate Knowledge Transferability? arXiv Preprint arXiv:2006.14512 2020.

[5] Junhua Zou, Zhisong Pan, Junyang Qiu, Xin Liu, Ting Rui, Wei Li. Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting. ECCV 2020.

[6] Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, Alan Yuille. Improving Transferability of Adversarial Examples with Input Diversity. CVPR 2019.

[7] Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu. Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks. CVPR 2019.

[8] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li. Boosting Adversarial Attacks with Momentum. CVPR 2018.

[9] Xiaosen Wang, Xuanran He, Jingdong Wang, Kun He. Admix: Enhancing the Transferability of Adversarial Attacks. ICCV 2021.

[10] Xiaosen Wang, Kun He. Enhancing the Transferability of Adversarial Attacks through Variance Tuning. CVPR 2021.

[11] Xiaosen Wang, Jiadong Lin, Han Hu, Jingdong Wang, Kun He. Boosting Adversarial Transferability through Enhanced Momentum. BMVC 2021.

[12] Weibin Wu, Yuxin Su, Michael R. Lyu, Irwin King. Improving the Transferability of Adversarial Samples With Adversarial Transformations. CVPR 2021.

[13] Zhibo Wang, Hengchang Guo, Zhifei Zhang, Wenxin Liu, Zhan Qin, Kui Ren. Feature Importance-aware Transferable Adversarial Attacks. ICCV 2021.
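Many of the methods above build on the momentum iterative attack of [8]: accumulate L1-normalized gradients into a momentum buffer and step in its sign. A sketch, assuming 4-D image batches with pixels in [0, 1]:

```python
import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps=16/255, steps=10, mu=1.0):
    """Momentum Iterative FGSM (Dong et al., CVPR 2018, [8] above)."""
    alpha = eps / steps
    g = torch.zeros_like(x)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Accumulate the L1-normalized gradient into the momentum buffer.
        g = mu * g + grad / grad.abs().flatten(1).sum(dim=1).view(-1, 1, 1, 1)
        x_adv = x_adv.detach() + alpha * g.sign()
        # Project back into the L_inf eps-ball around x and the valid range.
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)
    return x_adv
```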

Hard Label Attack

[1] Wieland Brendel, Jonas Rauber, Matthias Bethge. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. ICLR 2018.

[2] Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin. Black-box Adversarial Attacks with Limited Queries and Information. ICML 2018.

[3] Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, Cho-Jui Hsieh. Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach. ICLR 2019.

[4] Yinpeng Dong, Hang Su, Baoyuan Wu, Zhifeng Li, Wei Liu, Tong Zhang, Jun Zhu. Efficient Decision-based Black-box Adversarial Attacks on Face Recognition. CVPR 2019.

[5] Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard. A geometry-inspired decision-based attack. ICCV 2019.

[6] Jianbo Chen, Michael I. Jordan, Martin J. Wainwright. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack. IEEE S&P 2020.

[7] Minhao Cheng, Simranjit Singh, Patrick Chen, Pin-Yu Chen, Sijia Liu, Cho-Jui Hsieh. Sign-OPT: A Query-Efficient Hard-label Adversarial Attack. ICLR 2020.

[8] Ali Rahmati, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Huaiyu Dai. GeoDA: a geometric framework for black-box adversarial attacks. CVPR 2020.

[9] Huichen Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li. QEBA: Query-Efficient Boundary-Based Blackbox Attack. CVPR 2020.

[10] Weilun Chen, Zhaoxiang Zhang, Xiaolin Hu, Baoyuan Wu. Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip. ECCV 2020.

[11] Thibault Maho, Teddy Furon, Erwan Le Merrer. SurFree: a fast surrogate-free black-box attack. CVPR 2021.

[12] Huichen Li, Linyi Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li. Nonlinear Projection Based Gradient Estimation for Query Efficient Blackbox Attacks. AISTATS 2021.

[13] Xiaosen Wang, Zeliang Zhang, Kangheng Tong, Dihong Gong, Kun He, Zhifeng Li, Wei Liu. Triangle Attack: A Query-efficient Decision-based Adversarial Attack. ECCV 2022.

[14] Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter. Simple and Efficient Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes. KDD 2021.
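A primitive shared by several of these attacks (e.g., [3], [6], [7]) is locating the decision boundary along a line using only top-1 label queries. A sketch, where `predict(x) -> class id` is an assumed hard-label oracle:

```python
def bisect_to_boundary(predict, x_clean, x_adv, y_true, tol=1e-3):
    """Binary-search the decision boundary between a clean image and an
    adversarial one, using only hard-label queries. This is the shared
    primitive behind OPT, Sign-OPT and HopSkipJump-style attacks."""
    lo, hi = 0.0, 1.0   # interpolation weights: lo is clean-side, hi adv-side
    while hi - lo > tol:
        mid = (lo + hi) / 2
        x_mid = (1 - mid) * x_clean + mid * x_adv
        if predict(x_mid) == y_true:
            lo = mid    # still classified correctly: move toward x_adv
        else:
            hi = mid    # already adversarial: tighten toward x_clean
    return (1 - hi) * x_clean + hi * x_adv
```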

Unrestricted Adversarial Examples

[1] Yang Song, Rui Shu, Nate Kushman and Stefano Ermon. Constructing Unrestricted Adversarial Examples with Generative Models. NeurIPS 2018.

[2] Xiaosen Wang, Kun He and John E. Hopcroft. AT-GAN: A Generative Attack Model for Adversarial Transferring on Generative Adversarial Nets. arXiv Preprint arXiv:1904.07793.

Black-box attacks

[1] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, Cho-Jui Hsieh. ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models. ACM Workshop on Artificial Intelligence and Security (AISec) 2017.

[2] Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin. Black-box Adversarial Attacks with Limited Queries and Information. ICML 2018.

[3] Andrew Ilyas, Logan Engstrom, Aleksander Madry. Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors. ICLR 2019.

[4] Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song. Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms. ECCV 2018.

[5] Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su and Jun Zhu. Improving Black-box Adversarial Attacks with a Transfer-based Prior. NeurIPS 2019.
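In the score-based setting, [2] estimates gradients with NES: query the loss at antithetic Gaussian-perturbed points and average. A sketch, where `loss_fn(x) -> scalar` is an assumed query oracle for the black-box model's loss:

```python
import torch

def nes_gradient(loss_fn, x, sigma=0.01, n_samples=50):
    """NES-style zeroth-order gradient estimate (Ilyas et al., ICML 2018, [2]).
    Only function values are used, so no access to model gradients is needed."""
    g = torch.zeros_like(x)
    for _ in range(n_samples):
        u = torch.randn_like(x)
        # Antithetic pair reduces the variance of the estimate.
        g += (loss_fn(x + sigma * u) - loss_fn(x - sigma * u)) * u
    return g / (2 * sigma * n_samples)
```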

Hard-label attacks

[1] Wieland Brendel, Jonas Rauber and Matthias Bethge. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. ICLR 2018.

[2] Andrew Ilyas, Logan Engstrom, Anish Athalye and Jessy Lin. Black-box Adversarial Attacks with Limited Queries and Information. ICML 2018.

[3] Yinpeng Dong, Hang Su, Baoyuan Wu, Zhifeng Li, Wei Liu, Tong Zhang and Jun Zhu. Efficient Decision-based Black-box Adversarial Attacks on Face Recognition. CVPR 2019.

[4] Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang and Cho-Jui Hsieh. Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach. ICLR 2019.

[5] Minhao Cheng, Simranjit Singh, Patrick Chen, Pin-Yu Chen, Sijia Liu and Cho-Jui Hsieh. Sign-OPT: A Query-Efficient Hard-label Adversarial Attack. ICLR 2020.

[6] Weilun Chen, Zhaoxiang Zhang, Xiaolin Hu, and Baoyuan Wu. Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip. ECCV 2020.

[7] Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard. A Geometry-Inspired Decision-Based Attack. ICCV 2019.

[8] Thibault Maho, Teddy Furon, Erwan Le Merrer. SurFree: a fast surrogate-free black-box attack. CVPR 2021.

Others

[1] Xiaoyi Dong, Jiangfan Han, Dongdong Chen, Jiayang Liu, Huanyu Bian, Zehua Ma, Hongsheng Li, Xiaogang Wang, Weiming Zhang and Nenghai Yu. Robust Superpixel-Guided Attentional Adversarial Attack. CVPR 2020.

[2] Linxi Jiang, Xingjun Ma, Zejia Weng, James Bailey and Yu-Gang Jiang. Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness. arXiv Preprint arXiv:2006.13726.

Unrecognized Images

[1] Anh Nguyen, Jason Yosinski and Jeff Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. CVPR 2015.

Adversarial Defense

Adversarial Training

[1] Ian J. Goodfellow, Jonathon Shlens and Christian Szegedy. Explaining and Harnessing Adversarial Examples. ICLR 2015.

[2] Runtian Zhai, Tianle Cai, Di He, Chen Dan, Kun He, John Hopcroft and Liwei Wang. Adversarially Robust Generalization Just Requires More Unlabeled Data. arXiv Preprint arXiv:1906.00555.

[3] Yair Carmon, Aditi Raghunathan, Ludwig Schmidt, Percy Liang and John C. Duchi. Unlabeled Data Improves Adversarial Robustness. arXiv Preprint arXiv:1905.13736.

[4] Jonathan Uesato, Jean-Baptiste Alayrac, Po-Sen Huang, Robert Stanforth, Alhussein Fawzi and Pushmeet Kohli. Are Labels Required for Improving Adversarial Robustness? arXiv Preprint arXiv:1905.13725.

[5] Chuanbiao Song, Kun He, Liwei Wang and John E. Hopcroft. Improving the Generalization of Adversarial Training with Domain Adaptation. ICLR 2019.

[6] Hang Yu, Aishan Liu, Xianglong Liu, Gengchao Li, Ping Luo, Ran Cheng, Jichen Yang and Chongzhi Zhang. PDA: Progressive Data Augmentation for General Robustness of Deep Neural Networks. arXiv Preprint arXiv:1909.04839.

[7] Chuanbiao Song, Kun He, Jiadong Lin, Liwei Wang and John E. Hopcroft. Robust Local Features for Improving the Generalization of Adversarial Training. ICLR 2020.

[8] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, Michael I. Jordan. Theoretically Principled Trade-off between Robustness and Accuracy. ICML 2019.

[9] Yuanhao Xiong and Cho-Jui Hsieh. Improved Adversarial Training via Learned Optimizer. arXiv Preprint arXiv:2004.12227.

[10] Pranjal Awasthi, Natalie Frank and Mehryar Mohri. Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks. arXiv Preprint arXiv:2004.13617.

[11] Yisen Wang, Difan Zou, Jinfeng Yi, James Bailey, Xingjun Ma and Quanquan Gu. Improving Adversarial Robustness Requires Revisiting Misclassified Examples. ICLR 2020.

[12] Chang Xiao, Peilin Zhong, Changxi Zheng. Enhancing Adversarial Defense by k-Winners-Take-All. ICLR 2020.

[13] Saehyung Lee, Hyungyu Lee and Sungroh Yoon. Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization. CVPR 2020.

[14] Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui and Ruitong Huang. MMA Training: Direct Input Space Margin Maximization through Adversarial Training. ICLR 2020.

[15] Harini Kannan, Alexey Kurakin and Ian Goodfellow. Adversarial Logit Pairing. arXiv Preprint arXiv:1803.06373.

[16] Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille and Quoc V. Le. Smooth Adversarial Training. arXiv Preprint arXiv:2006.14536.

[17] Anh Bui, Trung Le, He Zhao, Paul Montague, Olivier deVel, Tamas Abraham and Dinh Phung. Improving Adversarial Robustness by Enforcing Local and Global Compactness. ECCV 2020.

[18] David Stutz, Matthias Hein and Bernt Schiele. Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks. ICML 2020.

[19] Tianyu Pang, Chao Du, and Jun Zhu. Max-Mahalanobis Linear Discriminant Analysis Networks. ICML 2018.

[20] Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein. Adversarial Training for Free!. NeurIPS 2019.

[21] Yinpeng Dong, Zhijie Deng, Tianyu Pang, Hang Su, Jun Zhu. Adversarial Distributional Training for Robust Deep Learning. NeurIPS 2020.

[22] Alex Lamb, Vikas Verma, Juho Kannala, Yoshua Bengio. Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy. ACM AISec 2019.

[23] Alfred Laugros, Alice Caplier, Matthieu Ospici. Addressing Neural Network Robustness with Mixup and Targeted Labeling Adversarial Training. ECCV 2019.

[24] Saehyung Lee, Hyungyu Lee, Sungroh Yoon. Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization. CVPR 2020.

[25] Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen, Qian Wang. Recent Advances in Adversarial Training for Adversarial Robustness. arXiv Preprint arXiv:2102.01356 2021.

[26] Leslie Rice, Eric Wong, J. Zico Kolter. Overfitting in adversarially robust deep learning. ICML 2020.

[27] Jason Bunk, Srinjoy Chattopadhyay, B. S. Manjunath, Shivkumar Chandrasekaran. Adversarially Optimized Mixup for Robust Classification. arXiv Preprint arXiv:2103.11589 2021.

[28] Zuxuan Wu, Tom Goldstein, Larry S. Davis, Ser-Nam Lim. THAT: Two Head Adversarial Training for Improving Robustness at Scale. arXiv Preprint arXiv:2103.13612 2021.

[29] Xiaosen Wang, Bhavya Kailkhura, Krishnaram Kenthapadi, Bo Li. I-PGD-AT: Efficient Adversarial Training via Imitating Iterative PGD Attack. OpenReview 2021.

[30] Xiaosen Wang, Chuanbiao Song, Kun He. Multi-stage Optimization based Adversarial Training. arXiv Preprint arXiv:2106.15357 2021.
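The common backbone of these methods is min-max training: an inner loop crafts a perturbation that maximizes the loss, and the outer step minimizes the loss on the perturbed input. A generic PGD-style sketch (not any single paper's exact recipe; `model` and `optimizer` are standard PyTorch objects):

```python
import torch
import torch.nn.functional as F

def pgd_at_step(model, optimizer, x, y, eps=8/255, alpha=2/255, steps=10):
    """One adversarial-training step: inner PGD maximization, outer minimization."""
    # Inner loop: find a strong perturbation within the L_inf eps-ball.
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    # Outer step: train the model on the adversarial example.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(torch.clamp(x + delta.detach(), 0, 1)), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```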

Fast Adversarial Training

[1] Eric Wong, Leslie Rice, J. Zico Kolter. Fast is better than free: Revisiting adversarial training. ICLR 2020.

[2] Maksym Andriushchenko, Nicolas Flammarion. Understanding and Improving Fast Adversarial Training. NeurIPS 2020.

[3] Hoki Kim, Woojin Lee, Jaewook Lee. Understanding Catastrophic Overfitting in Single-step Adversarial Training. AAAI 2021.
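[1] above shows that a single FGSM step from a random start inside the eps-ball can replace the multi-step inner loop. A sketch of that perturbation step:

```python
import torch
import torch.nn.functional as F

def fast_at_perturb(model, x, y, eps=8/255, alpha=10/255):
    """Single-step FGSM with random initialization, as in 'Fast is better
    than free' (Wong et al., ICLR 2020, [1] above)."""
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
    grad = torch.autograd.grad(loss, delta)[0]
    # One signed-gradient step, then project back into the eps-ball.
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return torch.clamp(x + delta.detach(), 0, 1)
```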

GAN-Based Defense

[1] Pouya Samangouei, Maya Kabkab and Rama Chellappa. Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models. ICLR 2018.

[2] Yang Song, Taesup Kim, Sebastian Nowozin, Stefano Ermon and Nate Kushman. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples. ICLR 2018.

[3] Guoqing Jin, Shiwei Shen, Dongming Zhang, Feng Dai and Yongdong Zhang. APE-GAN: Adversarial Perturbation Elimination with GAN. ICASSP 2019.

Certified defense

[1] Eric Wong and J. Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. ICML 2018.

[2] Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann and Pushmeet Kohli. Scalable Verified Training for Provably Robust Image Classification. ICCV 2019.

[3] Jeremy M Cohen, Elan Rosenfeld and J. Zico Kolter. Certified Adversarial Robustness via Randomized Smoothing. ICML 2019.

[4] Guang-He Lee, Yang Yuan, Shiyu Chang and Tommi S. Jaakkola. Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers. NeurIPS 2019.
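For intuition on randomized smoothing ([3]), the smoothed classifier returns the majority vote of the base classifier under Gaussian input noise; the certified radius additionally requires the paper's binomial hypothesis test, omitted here. A prediction-only sketch for a single image `x` of shape (C, H, W):

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n=1000, batch=100):
    """Majority-vote prediction of a randomized-smoothing classifier
    (Cohen et al., ICML 2019, [3] above)."""
    counts = None
    with torch.no_grad():
        for _ in range(0, n, batch):
            noise = sigma * torch.randn(batch, *x.shape, device=x.device)
            logits = model(torch.clamp(x.unsqueeze(0) + noise, 0.0, 1.0))
            preds = logits.argmax(dim=1)
            binc = torch.bincount(preds, minlength=logits.shape[1])
            counts = binc if counts is None else counts + binc
    return counts.argmax().item()  # majority-vote class
```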

Others

[1] Matthew J. Roos. Utilizing a null class to restrict decision spaces and defend against neural network adversarial attacks. arXiv Preprint arXiv:2002.10084 2020.

[2] Alvin Chan, Yi Tay, Yew Soon Ong and Jie Fu. Jacobian Adversarially Regularized Networks for Robustness. ICLR 2020.

[3] Christian Etmann, Sebastian Lunz, Peter Maass and Carola-Bibiane Schönlieb. On the Connection Between Adversarial Robustness and Saliency Map Interpretability. ICML 2019.

[4] Zhun Deng, Linjun Zhang, Amirata Ghorbani and James Zou. Improving Adversarial Robustness via Unlabeled Out-of-Domain Data. arXiv Preprint arXiv:2006.08476 2020.

[5] Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen and Jun Zhu. Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness. ICLR 2020.

Others

[1] Haohan Wang, Xindi Wu, Zeyi Huang, Eric P. Xing. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks. CVPR 2020.

[2] Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin and David Lopez-Paz. Mixup: Beyond Empirical Risk Minimization. ICLR 2018.

[3] Ludwig Schmidt, Shibani Santurkar, Dimitris Tsipras, Kunal Talwar and Aleksander Mądry. Adversarially Robust Generalization Requires More Data. NeurIPS 2018.

[4] Tianyuan Zhang and Zhanxing Zhu. Interpreting Adversarially Trained Convolutional Neural Networks. ICML 2019.

[5] Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix A. Wichmann and Wieland Brendel. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. ICLR 2019.

[6] Dong Yin, Raphael Gontijo Lopes, Jonathon Shlens, Ekin D. Cubuk and Justin Gilmer. A Fourier Perspective on Model Robustness in Computer Vision. NeurIPS 2019.

[7] Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang and Carl Vondrick. Multitask Learning Strengthens Adversarial Robustness. arXiv Preprint arXiv:2007.07236.

[8] Xiao Wang, Siyue Wang, Pin-Yu Chen, Yanzhi Wang, Brian Kulis, Xue Lin and Peter Chin. Protecting Neural Networks with Hierarchical Random Switching: Towards Better Robustness-Accuracy Trade-off for Stochastic Defenses. IJCAI 2019.

[9] Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto. Adversarial Training Reduces Information and Improves Transferability. arXiv Preprint arXiv:2007.11259 2020.

Adversarial Examples in Natural Language Processing

Survey

[1] Wei Emma Zhang, Quan Z. Sheng, Ahoud Alhazmi and Chenliang Li. Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey. arXiv Preprint arXiv:1901.06796 2019.

[2] Wenqi Wang, Lina Wang, Benxiao Tang, Run Wang and Aoshuang Ye. Towards a Robust Deep Neural Network in Text Domain A Survey. arXiv Preprint arXiv:1902.07285 2019.

Newest paper

Adversarial Attack

[1] John X. Morris, Eli Lifland, Jin Yong Yoo and Yanjun Qi. TextAttack: A Framework for Adversarial Attacks in Natural Language Processing. arXiv Preprint arXiv:2005.05909 2020.

Character-Level

[1] Javid Ebrahimi, Anyi Rao, Daniel Lowd and Dejing Dou. HotFlip: White-Box Adversarial Examples for Text Classification. ACL 2018.

[2] Ji Gao, Jack Lanchantin, Mary Lou Soffa and Yanjun Qi. Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. IEEE S&P workshop 2018.

Word-Level

[1] Nicolas Papernot, Patrick McDaniel, Ananthram Swami and Richard Harang. Crafting Adversarial Input Sequences for Recurrent Neural Networks. MILCOM 2016.

[2] Volodymyr Kuleshov, Shantanu Thakoor, Tingfung Lau and Stefano Ermon. Adversarial Examples for Natural Language Classification Problems. ICLR 2018 rejected.

[3] Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava and Kai-Wei Chang. Generating Natural Language Adversarial Examples. EMNLP 2018.

[4] Shuhuai Ren, Yihe Deng, Kun He and Wanxiang Che. Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. ACL 2019.

[5] Huangzhao Zhang, Hao Zhou, Ning Miao and Lei Li. Generating Fluent Adversarial Examples for Natural Languages. ACL 2019.

[6] Yi-Ting Tsai, Min-Chu Yang and Han-Yu Chen. Adversarial Attack on Sentiment Classification. ACL workshop 2019.

[7] Samuel Barham and Soheil Feizi. Interpretable Adversarial Training for Text. arXiv Preprint arXiv:1905.12864 2019.

[8] Di Jin, Zhijing Jin, Joey Tianyi Zhou and Peter Szolovits. Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment. arXiv Preprint arXiv:1907.11932 2019.

[9] Xiaosen Wang, Hao Jin and Kun He. Natural Language Adversarial Attacks and Defenses in Word Level. arXiv Preprint arXiv:1909.06723 2019.

[10] Suranjana Samanta and Sameep Mehta. Towards Crafting Text Adversarial Samples. arXiv Preprint arXiv:1707.02812 2017.

[11] Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu and Maosong Sun. Word-level Textual Adversarial Attacking as Combinatorial Optimization. ACL 2020.

[12] Xiaosen Wang, Yichen Yang, Yihe Deng and Kun He. Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks. AAAI 2021.

[13] Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu and Xuanjing Huang. Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces. arXiv Preprint arXiv:2012.14769 2020.

[14] Bushra Sabir, M. Ali Babar, Raj Gaire. ReinforceBug: A Framework to Generate Adversarial Textual Examples. NAACL 2021.

[15] Xuanli He, Lingjuan Lyu, Qiongkai Xu, Lichao Sun. Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!. NAACL 2021.

[16] Zhao Meng, Roger Wattenhofer. A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples. COLING 2020.

[17] Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi. A Strong Baseline for Query Efficient Attacks in a Black Box Setting. EMNLP 2021.

[18] Yangyi Chen, Jin Su, Wei Wei. Multi-granularity Textual Adversarial Attack with Behavior Cloning. EMNLP 2021.

[19] Shengcai Liu, Ning Lu, Cheng Chen, Ke Tang. Efficient Combinatorial Optimization for Word-level Adversarial Textual Attack. arXiv Preprint arXiv:2109.02229 2021.
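Most word-level attacks above share a greedy template: rank words by saliency, then substitute the synonym that most reduces the true-class probability (cf. [4] and the attack in [8]). A schematic sketch in which `prob(words, label)` and `synonyms(word)` are hypothetical helpers, not real APIs:

```python
def greedy_word_attack(words, label, prob, synonyms):
    """Generic greedy synonym-substitution attack. `prob(words, label)` is an
    assumed oracle for the victim model's probability of `label`;
    `synonyms(word)` is an assumed source of candidate substitutes."""
    # Rank positions: the larger the probability drop when a word is deleted,
    # the more salient that word is for the current prediction.
    saliency = sorted(
        range(len(words)),
        key=lambda i: prob(words[:i] + words[i + 1:], label),
    )
    adv = list(words)
    for i in saliency:
        best_word, best_p = adv[i], prob(adv, label)
        for cand in synonyms(words[i]):
            trial = adv[:i] + [cand] + adv[i + 1:]
            p = prob(trial, label)
            if p < best_p:
                best_word, best_p = cand, p
        adv[i] = best_word
        if best_p < 0.5:
            # Crude stopping rule (binary case); a real attack would
            # check whether the predicted label has actually flipped.
            break
    return adv
```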

Both

[1] Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li and Wenchang Shi. Deep text classification can be fooled. IJCAI 2018.

[2] Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li and Ting Wang. TextBugger: Generating Adversarial Text Against Real-world Applications. NDSS 2019.

Universal Adversarial Examples

[1] Di Li, Danilo Vasconcellos Vargas and Sakurai Kouichi. Universal Rules for Fooling Deep Neural Networks based Text Classification. CEC 2019.

[2] Melika Behjati, Seyed-Mohsen Moosavi-Dezfooli, Mahdieh Soleymani Baghshah and Pascal Frossard. Universal Adversarial Attacks on Text Classifiers. ICASSP 2019.

Adversarial Defense

Character-Level

[1] Danish Pruthi, Bhuwan Dhingra and Zachary C. Lipton. Combating Adversarial Misspellings with Robust Word Recognition. ACL 2019.

[2] Hui Liu, Yongzheng Zhang, Yipeng Wang, Zheng Lin, Yige Chen. Joint Character-level Word Embedding and Adversarial Stability Training to Defend Adversarial Text. AAAI 2020.

Word-Level

[1] Ishai Rosenberg, Asaf Shabtai, Yuval Elovici and Lior Rokach. Defense Methods Against Adversarial Examples for Recurrent Neural Networks. arXiv Preprint arXiv:1901.09963 2019.

[2] Xiaosen Wang, Hao Jin and Kun He. Natural Language Adversarial Attacks and Defenses in Word Level. arXiv Preprint arXiv:1909.06723 2019.

[3] Yi Zhou, Xiaoqing Zheng, Cho-Jui Hsieh, Kai-wei Chang and Xuanjing Huang. Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble. arXiv Preprint arXiv:2006.11627 2020.

[4] Xiaosen Wang, Yichen Yang, Yihe Deng and Kun He. Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks. AAAI 2021.

[5] Rishabh Maheshwary, Saket Maheshwary and Vikram Pudi. Generating Natural Language Attacks in a Hard Label Black Box Setting. AAAI 2021.

[6] Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun. Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning. arXiv Preprint arXiv:2012.15699 2020.

[7] Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu. Towards Robustness Against Natural Language Word Substitutions. ICLR 2021.

[8] Jin Yong Yoo, Yanjun Qi. Towards Improving Adversarial Training of NLP Models. EMNLP Findings 2021.

[9] Rongzhou Bao, Jiayi Wang, Hai Zhao. Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice. ACL Findings 2021.

[10] Yichen Yang, Xiaosen Wang, Kun He. Robust Textual Embedding against Word-level Adversarial Attacks. UAI 2022.

Both

[1] Nestor Rodriguez and Sergio Rojas-Galeano. Shielding Google's language toxicity model against adversarial attacks. arXiv Preprint arXiv:1801.01828 2018.

Certified defense

[1] Robin Jia, Aditi Raghunathan, Kerem Göksel and Percy Liang. Certified Robustness to Adversarial Word Substitutions. EMNLP-IJCNLP 2019.

[2] Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli. Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation. EMNLP-IJCNLP 2019.

[3] Jiehang Zeng, Xiaoqing Zheng, Jianhan Xu, Linyang Li, Liping Yuan, Xuanjing Huang. Certified Robustness to Text Adversarial Attacks by Randomized [MASK]. ACL Findings 2021.

Detection

[1] Yichao Zhou, Jyun-Yu Jiang, Kai-Wei Chang and Wei Wang. Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification. EMNLP-IJCNLP 2019.

[2] Maximilian Mozes, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin. Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples. EACL 2021.

[3] Xiaosen Wang, Yifeng Xiong, Kun He. Randomized Substitution and Vote for Textual Adversarial Example Detection. UAI 2022.
