Explicitly evaluating on the test set while developing a model is a deadly sin. If found out, you'd face serious damage to your reputation (like Baidu did a few years ago). [1] What the decreasing performance on the remade CIFAR-10 test set shows is probably more akin to a subtle form of overfitting: because these datasets have been around for a long time, the good results get published and the bad results get discarded, leading to a feedback loop. [2] It is also possible that the original test set was closer in distribution to the train set than the remade one is. The rankings stay too consistent for test-set evaluation cheating.
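A toy sketch of that feedback loop (the test-set size, class count, and number of "papers" are all invented for illustration): every "published" model below is pure guessing, yet keeping only the best score on one reused test set still inflates the reported accuracy, and the inflation vanishes on a freshly drawn test set.

    # Selection bias from reusing one test set: only the best of many
    # zero-skill "models" gets reported, so the reported number creeps
    # above chance; a remade test set pulls it back down.
    import numpy as np

    rng = np.random.default_rng(0)
    n_test, n_classes, n_papers = 2_000, 10, 1_000

    reused_labels = rng.integers(0, n_classes, n_test)  # the fixed benchmark test set
    fresh_labels = rng.integers(0, n_classes, n_test)   # a remade test set

    best_acc, best_preds = 0.0, None
    for _ in range(n_papers):                         # each "paper" peeks at the same set
        preds = rng.integers(0, n_classes, n_test)    # a model with zero real skill
        acc = (preds == reused_labels).mean()
        if acc > best_acc:                            # only the best result gets published
            best_acc, best_preds = acc, preds

    print(f"chance level:                  {1 / n_classes:.3f}")
    print(f"best published accuracy:       {best_acc:.3f}")
    print(f"best model on fresh test set:  {(best_preds == fresh_labels).mean():.3f}")

Real adaptive overfitting is subtler than this (models genuinely fit the data, and the selection happens through hyperparameter and architecture choices rather than pure luck), but the direction of the bias is the same.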
I also think the "do not trust saliency maps" conclusion is too strongly worded. The authors of that paper used adversarial techniques to change the saliency maps: not just random noise or slight variation, but carefully crafted noise designed to attack the saliency (feature importance) maps.
> For example, while it would be nice to have a CNN identify a spot on an MRI image as a malignant cancer-causing tumor, these results should not be trusted if they are based on fragile interpretation methods.
Interpretation methods are as fragile as the deep learning model itself, which is susceptible to adversarial images too. If you allow for scenarios with adversarial images, you should not trust the interpretation methods, but you should not trust the predictions themselves either, which destroys any pragmatic value left. It is hard to imagine a realistic threat scenario where MRIs are altered by an adversary _before_ they are fed into a CNN. When such a scenario is realistic, all bets are off. It is much like blaming Google Chrome for exposing passwords during an evil maid attack (when someone has access to your computer, they can do all sorts of nasty things; it is nearly impossible to guard against that). [3]
EDIT: meta(I liked the article and do not want to argue that it is wrong. It is difficult for me to start a thread without finding one or two things to nitpick, or a point to expand upon, but this article was already a great resource.)
As one of the authors of the "Interpretation of Neural Networks is Fragile" paper, I would agree with you.
To a certain extent, saliency maps can be perturbed even with random noise, but the more dramatic attacks (and certainly the targeted attacks, in which we move the saliency map from one region of the image to another, specified region) require carefully crafted adversarial perturbations.
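For intuition, here is a rough sketch of that kind of targeted attack; it is not the paper's code, and the tiny model, target region, and hyperparameters are invented. It optimizes a small perturbation so that gradient-saliency mass moves into a chosen quadrant while the predicted class stays fixed; softplus activations are used here so the second-order gradients the attack relies on are not zero everywhere, as they would be with plain ReLU.

    # Targeted saliency attack sketch: nudge the input within a small
    # L-infinity budget so the gradient saliency concentrates in a chosen
    # region, while the predicted class is held fixed.
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)

    model = torch.nn.Sequential(               # stand-in "classifier" with random weights
        torch.nn.Conv2d(1, 8, 3, padding=1), torch.nn.Softplus(),
        torch.nn.AdaptiveAvgPool2d(4), torch.nn.Flatten(),
        torch.nn.Linear(8 * 16, 10),
    ).eval()

    def saliency(img):
        """Vanilla gradient saliency: |d logit_of_predicted_class / d input|."""
        img = img.clone().requires_grad_(True)
        logits = model(img)
        logits[0, logits.argmax()].backward()
        return img.grad.abs()

    x = torch.rand(1, 1, 28, 28)               # stand-in input image
    target_class = model(x).argmax().item()    # keep this prediction unchanged

    region = torch.zeros(1, 1, 28, 28)         # push saliency into bottom-right quadrant
    region[..., 14:, 14:] = 1.0

    eps = 0.05                                 # barely visible perturbation budget
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=1e-2)

    for _ in range(300):
        opt.zero_grad()
        x_adv = (x + delta).clamp(0, 1)
        logits = model(x_adv)
        # saliency of the perturbed input, kept differentiable w.r.t. delta
        grad_x, = torch.autograd.grad(logits[0, target_class], x_adv, create_graph=True)
        sal = grad_x.abs()
        # maximize saliency mass in the target region, keep the original prediction
        loss = -(sal * region).sum() / sal.sum() \
               + F.cross_entropy(logits, torch.tensor([target_class]))
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)

    x_adv = (x + delta.detach()).clamp(0, 1)
    frac = lambda img: ((saliency(img) * region).sum() / saliency(img).sum()).item()
    print("prediction unchanged:", model(x_adv).argmax().item() == target_class)
    print(f"saliency mass in target region: {frac(x):.2f} -> {frac(x_adv):.2f}")

The point of the sketch is only that the objective being attacked is the explanation, not the label, which is why the prediction can stay the same while the saliency map moves.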
>"It is hard to imagine a realistic threat scenario where MRI's are altered by an adversary, _before_ they are fed into a CNN."
What about when hospital staff who suspect a patient has cancer use the best machine for that patient's scans, and tend to push patients they think are fine to the older, less capable instrument? Or when they choose to utilise time on the best instrument for children?
What about when the MRIs done at night are done by one technician who uses a slightly different process from the technicians who created the MRI data set?
At the very least there is a significant risk of systematic error being introduced by these kinds of bias, and as you say, it's really hard to guard against this. But if a classifier I produce is used where this happens and people die... well, whatever I feel, I would be responsible.
[1] https://www.technologyreview.com/s/538111/why-and-how-baidu-...
[2] http://hunch.net/?p=22
[3] https://www.theguardian.com/technology/2013/aug/07/google-ch...