Unlabeled Data for Adversarial Robustness
Effect of an adversarial perturbation on a natural input: the image is misclassified even though it looks unchanged to a human.
This blog post discusses how unlabeled data can be used to improve adversarial robustness in deep neural networks. It summarizes the following works:
- Using Pre-Training Can Improve Model Robustness and Uncertainty (ICML 2019)
- Unlabeled Data Improves Adversarial Robustness (NeurIPS 2019)
- Are Labels Required for Improving Adversarial Robustness? (NeurIPS 2019)
Brief Precursor to using unlabeled data for adversarial robustness
Adversarial robustness, in general, deals with the question of whether we can develop classifiers that are robust to (test-time) perturbations of their input made by an adversary intending to fool the classifier. This is of prime importance in critical applications of deep learning, such as cancer recognition systems and self-driving cars, where the margin for error is almost nil. In recent years there have been significant developments in creating both adversarial attacks and defenses against them.
Schmidt et al. showed that there is a sample complexity gap: on a classification task with CIFAR-10 as the dataset, reaching a robust accuracy equal to the clean accuracy requires more training samples than reaching that clean accuracy does.
From their work we can conclude that additional data is needed to improve adversarial robustness. This work served as a motivation for the papers discussed in this blog, whose authors asked how to obtain this additional data in order to bridge the sample complexity gap.
Using Pre-Training Improves Adversarial Robustness
This paper introduces the concept of adversarial pre-training. It is based on the following ideas:
- The need for more task-specific data can be addressed using pre-training (a typical transfer learning scenario).
- Data from a different distribution can still be beneficial for the target task (Huh et al.).
Method used
- Adversarial pre-training on the ImageNet dataset (1000 classes), downsampled to 32×32, using 10-step PGD with eps = 8/255 (l-infinity).
- Fine-tuning on CIFAR-10 using PGD-10 with eps = 8/255 for 5 epochs.
- Finally, evaluating on CIFAR-10 using PGD-20 with eps = 8/255 (l-infinity).
Note - They use WRN-28-10 for all their experiments.
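The attack used in each of these stages, PGD under an l-infinity budget, can be sketched in a few lines. The following is a minimal illustration on a toy linear logistic classifier, not the paper's WRN-28-10 setup. The function names, the step size `alpha`, and the toy weights and input are assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(w, b, x, y):
    """Logistic loss of a linear model and its gradient w.r.t. the input x (y in {0, 1})."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    tiny = 1e-12  # avoids log(0)
    loss = -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))
    grad_x = [(p - y) * wi for wi in w]
    return loss, grad_x

def sign(v):
    return (v > 0) - (v < 0)

def pgd_linf(w, b, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iterated signed-gradient ascent, projected back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        _, grad = loss_and_grad(w, b, x_adv, y)
        stepped = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Project onto the l-infinity ball and keep pixel values in [0, 1].
        x_adv = [
            min(max(s, max(xi - eps, 0.0)), min(xi + eps, 1.0))
            for s, xi in zip(stepped, x)
        ]
    return x_adv

# Toy (hypothetical) model and input.
w, b = [2.0, -3.0, 1.0], 0.1
x, y = [0.5, 0.4, 0.6], 1

clean_loss, _ = loss_and_grad(w, b, x, y)
x_adv = pgd_linf(w, b, x, y)
adv_loss, _ = loss_and_grad(w, b, x_adv, y)
print(clean_loss, adv_loss)
```

In adversarial training, the model parameters are then updated on `x_adv` instead of `x`. Using more steps at evaluation time (PGD-20) than during training (PGD-10) makes the reported robust accuracy a stricter test.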
Results
Their results show that the clean accuracy remains almost the same, whereas the adversarial accuracy increases significantly (by 12%). From this work we can conclude that adversarially robust features can transfer across data distributions.
Unlabeled Data Improves Adversarial Robustness
This paper addresses the following questions:
- How can we obtain the additional data needed to improve robustness?
- How can we get additional labeled data when labeling may be an expensive process?
The solution they propose is a semi-supervised adversarial training algorithm.
They call the algorithm Robust Self-Training (RST). It is based on:
- Taking unlabeled data and generating pseudo-labels for it (using a network pre-trained on the labeled data).
- Mixing the unlabeled data and the labeled data in a fixed proportion.
- Performing adversarial training (TRADES) on this combined dataset.
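The first two steps can be illustrated with a small sketch. A nearest-centroid classifier stands in here for the "network pre-trained on the labeled data". That substitution, the toy 2-D points, and all function names are assumptions made for illustration and are not taken from the paper.

```python
def fit_centroids(xs, ys):
    """Toy stand-in for the pre-trained network: one centroid per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Step 0: a small labeled set and a pool of unlabeled inputs (toy data).
labeled_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [[0.1, 0.2], [0.9, 0.7], [0.7, 0.9]]

# Step 1: train on labeled data, then pseudo-label the unlabeled pool.
centroids = fit_centroids(labeled_x, labeled_y)
pseudo_y = [predict(centroids, x) for x in unlabeled_x]

# Step 2: combine both sources; the flag records which labels are pseudo-labels.
combined = [(x, y, False) for x, y in zip(labeled_x, labeled_y)]
combined += [(x, y, True) for x, y in zip(unlabeled_x, pseudo_y)]

# Step 3 (not shown): run adversarial training on `combined`.
print(pseudo_y)
```

Once the pseudo-labels are generated, the training loop no longer needs to treat the two sources differently, apart from controlling how many samples of each kind it draws.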
The reasoning for using unlabeled data to bridge the sample complexity gap is:
- Labeling data is generally an expensive and tedious process.
- Adversarial robustness requires predictions to be stable around naturally occurring inputs. Achieving this stability doesn’t really require labels.
RST Algorithm pseudocode
Here, L_standard is the cross-entropy loss and L_robust is the KL-divergence loss (as used in TRADES).
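How the two terms combine for a single example can be sketched as follows. This is a simplified illustration that assumes the logits for the clean and the adversarial input are already available. The function names and the weight `beta` (with an illustrative value) are assumptions for this example, and the inner maximization that finds the adversarial input is not shown.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def trades_style_loss(clean_logits, adv_logits, label, beta=6.0):
    p_clean = softmax(clean_logits)
    p_adv = softmax(adv_logits)
    l_standard = cross_entropy(p_clean, label)  # uses the (pseudo-)label
    l_robust = kl_divergence(p_clean, p_adv)    # needs no label at all
    return l_standard + beta * l_robust

# Same prediction on clean and perturbed input: the robust term vanishes.
same = trades_style_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0], label=0)
# Prediction changes under perturbation: the robust term adds a penalty.
shifted = trades_style_loss([2.0, 0.5, -1.0], [0.5, 2.0, -1.0], label=0)
print(same, shifted)
```

The robust term only compares two predictions of the model, which is why it can be computed on unlabeled inputs.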
Note - Here CIFAR-10 is used as the labeled dataset, containing 50K training samples. The unlabeled data is selected from the 80M Tiny Images dataset following a selection procedure described in the paper. In total, 500K unlabeled images are selected.
Results
The results are reported using WRN-28-10 with a cosine learning-rate schedule and unsupervised_fraction = 0.5 (i.e. each epoch contains half labeled and half unlabeled data).
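One way to realize such a fixed proportion is to fix it within every mini-batch. The sketch below assumes that interpretation. The function name `mixed_batch`, the index ranges, and the batch size are hypothetical and chosen only for illustration.

```python
import random

def mixed_batch(labeled_idx, unlabeled_idx, batch_size, unsup_fraction=0.5, rng=random):
    """Draw a batch with a fixed share of unlabeled (pseudo-labeled) samples."""
    n_unsup = int(round(batch_size * unsup_fraction))
    n_sup = batch_size - n_unsup
    sup = rng.sample(labeled_idx, n_sup)
    unsup = rng.sample(unlabeled_idx, n_unsup)
    batch = [("labeled", i) for i in sup] + [("unlabeled", i) for i in unsup]
    rng.shuffle(batch)
    return batch

rng = random.Random(0)  # seeded for reproducibility
batch = mixed_batch(list(range(50)), list(range(500)), batch_size=8,
                    unsup_fraction=0.5, rng=rng)
n_unlabeled = sum(1 for source, _ in batch if source == "unlabeled")
print(batch, n_unlabeled)
```

Fixing the share in this way keeps the much larger unlabeled pool from dominating the training signal.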
We can observe that the RST model performs better than the other models on all the attacks, and it also achieves the highest accuracy on clean samples (89.7%).
Are Labels Required for Improving Adversarial Robustness?
Like the previous work, this work also focuses on using unlabeled data for improving adversarial robustness.
The main contributions of this work are:
- It proposes the UAT (Unsupervised Adversarial Training) method to make use of unlabeled data.
- It shows that unlabeled data can be competitive with labeled data for bridging the sample complexity gap identified by Schmidt et al.
- It sets a new state of the art on CIFAR-10 using uncurated unlabeled data.
The motivation behind this work is:
- Labeled data is expensive.
- Adversarial robustness depends on the smoothness of the classifier, which can be assessed using unlabeled data.
- Only a small amount of labeled data is needed for standard generalization.
Here, they propose three variants of the UAT algorithm and experiment with all three of them.
These loss notations are useful for understanding the pseudocode below. As the losses show, both of them use the adversarially perturbed input (taken from the l-infinity norm ball around the original input).
Algorithm UAT-OT
Algorithm UAT-FT
Algorithm UAT++
Note that the UAT++ algorithm is very similar to the RST algorithm of the previous paper. The difference is that in UAT++ both losses (cross-entropy and KL) use the adversarially perturbed input (taken from the l-infinity norm ball), whereas in RST only the KL loss is computed on the adversarial input.
Also, here the unlabeled dataset is taken from the labeled dataset itself, with the labels discarded.
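The difference between the RST and UAT++ objectives noted above can be made concrete with a small sketch. It only evaluates the two loss terms for a single example, given clean and adversarial logits. The function names and the toy logits are assumptions for illustration, and details such as how each paper weights the terms or where gradients are stopped are left out.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def rst_style_terms(clean_logits, adv_logits, pseudo_label):
    """Cross-entropy on the clean input, KL on the perturbed input."""
    p_clean, p_adv = softmax(clean_logits), softmax(adv_logits)
    return cross_entropy(p_clean, pseudo_label), kl_divergence(p_clean, p_adv)

def uat_pp_style_terms(clean_logits, adv_logits, pseudo_label):
    """Both terms are evaluated on the perturbed input."""
    p_clean, p_adv = softmax(clean_logits), softmax(adv_logits)
    return cross_entropy(p_adv, pseudo_label), kl_divergence(p_clean, p_adv)

clean_logits = [2.0, 0.5, -1.0]  # model is confident in class 0 on the clean input
adv_logits = [0.5, 2.0, -1.0]    # the perturbation flips the prediction to class 1

rst_ce, rst_kl = rst_style_terms(clean_logits, adv_logits, pseudo_label=0)
uat_ce, uat_kl = uat_pp_style_terms(clean_logits, adv_logits, pseudo_label=0)
print(rst_ce, uat_ce, rst_kl, uat_kl)
```

In this toy case the KL terms coincide, while the cross-entropy term is larger in the UAT++-style variant because it is measured on the input that the perturbation has already pushed toward the wrong class.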
Here, we can see that the UAT++ algorithm performs as well as a supervised oracle (a model trained with the true labels for the unlabeled data as well). This shows that unlabeled data is competitive with labeled data for adversarial robustness.
Conclusion
By studying the above three papers, we have seen how unlabeled data can be used to improve adversarial robustness. For more details, it is recommended to read the original papers. For a general idea about adversarial robustness one can refer here.
Stay tuned for more ML and DL content!