Achieving Adversarial Robustness via Sparsity


Network parameter numbers vs adversarial robustness


Authors
Ningyi Liao, Shufan Wang, Liyao Xiang, Nanyang Ye, Shuo Shao, Pengzhi Chu
Publication
Machine Learning 111: 685–711
Type
Journal article

Abstract

Network pruning is known to produce compact models without much accuracy degradation. However, how the pruning process affects a network’s robustness, and the mechanism behind this effect, remain unresolved. In this work, we theoretically prove that the sparsity of network weights is closely associated with model robustness. Through experiments on a variety of adversarial pruning methods, image-classification models and datasets, we find that weight sparsity does not hurt but rather improves robustness, and that both weight inheritance from the lottery ticket and adversarial training improve model robustness in network pruning. Based on these findings, we propose a novel adversarial training method called inverse weights inheritance, which imposes a sparse weight distribution on a large network by inheriting weights from a small network, thereby improving the robustness of the large network.
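
As a rough illustration of the idea in the abstract, the sketch below shows one possible reading of how a large layer could inherit a sparse weight distribution from a small one. It is not the authors' implementation. The helper names (`magnitude_mask`, `inherit_from_small`), the use of magnitude pruning to obtain the small network, and the choice to zero the non-inherited weights are all assumptions made for this example. The adversarial training step that would follow is omitted.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Boolean mask that drops the smallest-magnitude `sparsity` fraction of weights.

    Hypothetical helper: magnitude pruning is assumed here as the way the
    small network is obtained.
    """
    k = int(round(sparsity * weights.size))  # number of weights to prune
    mask = np.ones(weights.size, dtype=bool)
    if k > 0:
        order = np.argsort(np.abs(weights).ravel(), kind="stable")
        mask[order[:k]] = False  # prune the k smallest magnitudes
    return mask.reshape(weights.shape)


def inherit_from_small(small_weights, mask):
    """Build a large weight matrix that inherits the small network's weights.

    Hypothetical helper: positions outside the mask are set to zero
    (an assumption), so the large layer starts from a sparse distribution.
    """
    large = np.zeros(mask.shape, dtype=small_weights.dtype)
    large[mask] = small_weights  # row-major order matches dense[mask]
    return large


rng = np.random.default_rng(0)
dense = rng.normal(size=(4, 6))              # stand-in for one trained layer
mask = magnitude_mask(dense, sparsity=0.75)  # keep 6 of the 24 weights
small = dense[mask]                          # weights of the "small" network
large_init = inherit_from_small(small, mask)
```

In this sketch, `large_init` would serve as the starting point for training the large network, so that training begins from the sparse weight distribution of the small one.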


Citation
Ningyi Liao, Shufan Wang, Liyao Xiang, Nanyang Ye, Shuo Shao, Pengzhi Chu. "Achieving Adversarial Robustness via Sparsity." Machine Learning 111: 685–711. 2022.