Pitting machine vision models against adversarial attacks.
EPFL Digital Epidemiology Lab
over 2 years ago
I would have expected the score of a model on a particular test sample to be based primarily on the minimum distance over all successful adversarial examples. That seems closer to measuring true model robustness.
It makes sense that the score should be higher when all attacks fail, but in that case the failed attempts at larger distances seem more indicative of robustness than the failures at small distances - so I don’t understand using the minimum distance of a failed attempt.
over 2 years ago
Dear Brandon, sorry for the late reply; we are primarily monitoring the gitter channel at the moment. Anyhow, I am not sure I completely understand your question, as we take the minimum distance across all attacks. Only in the case that really all attacks fail do we resort to a kind of gray-scale default adversarial (which has a fairly large distance to the original sample). However, I don’t expect this case to ever become relevant in practice.
about 2 years ago
Thanks Matthias, and thanks for pointing out the gitter channel. Your answer and that discussion helped me clear up my question. I was confused by this sentence in the evaluation criterion:
“If a model misclassifies a sample then the minimum adversarial distance is registered as zero for this sample.”
I was incorrectly thinking of ‘sample’ here as the attempted adversarial example. Anyway, I get it now.
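For anyone else confused by the same point, here is a minimal sketch of the per-sample scoring rule as described in this thread. The function and variable names (`score_sample`, `DEFAULT_DISTANCE`) are illustrative, not the challenge's actual implementation, and the fallback distance value is an assumed stand-in for the gray-scale default adversarial.

```python
# Hypothetical sketch of the per-sample scoring rule discussed above.
# DEFAULT_DISTANCE is an assumed stand-in for the gray-scale default
# adversarial's "fairly large" distance; the real value is not given here.
DEFAULT_DISTANCE = 100.0

def score_sample(clean_correct, successful_attack_distances):
    """Return the minimum adversarial distance registered for one test sample.

    clean_correct: whether the model classifies the unperturbed sample correctly.
    successful_attack_distances: distances of all adversarial examples,
        across every attack, that flipped the model's prediction.
    """
    if not clean_correct:
        # A misclassified clean sample is registered as distance zero.
        return 0.0
    if not successful_attack_distances:
        # If really all attacks fail, fall back to the default adversarial
        # with its large fixed distance.
        return DEFAULT_DISTANCE
    # Otherwise, robustness on this sample is the smallest perturbation
    # that fooled the model, taken across all attacks.
    return min(successful_attack_distances)
```

So the minimum is taken over successful attacks only; a failed attack simply contributes nothing to the list, and the large-distance fallback applies only when that list is empty.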