The thing with neural nets is that once one gets over being impressed by their ability to do a wide variety of things that previously weren't doable, and considers them simply as applications that have to stand alongside other applications, they look like ... the worst applications ever.
They're ... undocumented and have unspecified behavior (leading to serious, not-hypothetical problems like racial bias). Their maintenance is unspecified or nonexistent. They inherently have security holes of many kinds. Their performance isn't guaranteed and is measured only against their own benchmarks, etc...
This kind of thing is much more of a problem when they have to be a system used by a large institution as just part of its operations (a loan evaluation program or a recidivism predictor) rather than a standalone application (a program to play the game of Go). But for the former use, a lot more work needs to be done.
This is basically one of my problems with self-driving cars and putting your life in the hands of neural networks. It should theoretically be possible to find vulnerabilities in these systems. Someone could possibly use some sort of genetic algorithm to evolve a sign that causes a car to veer off the road and crash.
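To make that concrete, here's roughly the kind of thing I mean, as a toy numpy sketch: a black-box evolutionary search for a small, bounded perturbation that drives a classifier's confidence down. The classify function is purely a hypothetical stand-in for a real perception model; the point is just that an attacker only needs to query the model, not understand it.

    # Toy sketch of a black-box evolutionary attack. `classify` is a made-up
    # stand-in for a real vision model; an attacker only observes its outputs.
    import numpy as np

    rng = np.random.default_rng(0)

    def classify(image):
        # Hypothetical classifier: returns "probability of a stop sign".
        w = np.linspace(-1, 1, image.size).reshape(image.shape)
        return 1.0 / (1.0 + np.exp(-np.sum(w * image)))

    def evolve_perturbation(image, pop_size=50, generations=200, eps=0.05):
        # Evolve a perturbation bounded by +/- eps that minimises the score.
        pop = rng.uniform(-eps, eps, size=(pop_size,) + image.shape)
        for _ in range(generations):
            scores = np.array([classify(np.clip(image + p, 0, 1)) for p in pop])
            # Keep the perturbations that most reduce the classifier's confidence.
            elite = pop[np.argsort(scores)[: pop_size // 5]]
            # Next generation: mutated copies of the elite, still bounded.
            children = elite[rng.integers(len(elite), size=pop_size)]
            children = children + rng.normal(0, eps / 10, size=children.shape)
            pop = np.clip(children, -eps, eps)
        return min(pop, key=lambda p: classify(np.clip(image + p, 0, 1)))

    sign = rng.uniform(0.4, 0.6, size=(8, 8))   # toy "stop sign" image
    print("before:", classify(sign))
    print("after: ", classify(np.clip(sign + evolve_perturbation(sign), 0, 1)))

Obviously a real attack would have to work against a real model through a camera, which is far harder, but nothing in the loop above requires knowing how the network works internally.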
Maybe I'm just scaremongering or worrying about nothing.
I just don't really feel comfortable putting my life in the hands of something whose workings no one fully understands.
I have a hypothesis that these errors in interpretation stem mainly from the max pooling layers that occur in nearly all CNNs today. Simply cutting out a large fraction (at least 75%!) of the information passing through each pooling layer makes the models quite brittle.
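As a rough illustration of how much gets thrown away (assuming the common 2x2, stride-2 max pool), here's a toy numpy example: 16 activations go in, 4 come out, and the other 12 never reach the next layer.

    import numpy as np

    feature_map = np.arange(16, dtype=float).reshape(4, 4)

    # 2x2 max pooling with stride 2: split into non-overlapping 2x2 blocks
    # and keep only the maximum of each block.
    pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))

    print(feature_map)  # 4x4 = 16 values in
    print(pooled)       # 2x2 = 4 values out; the other 12 are simply dropped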
That, and obviously it would be a big help to be able to interpret meaningful symbols (e.g. words) from the images rather than just recognize a certain arbitrary class of pattern activations. That is pretty hard, of course, but IMO essential for the upcoming self-driving car revolution, if it is to succeed.
G. Hinton has some ideas though they aren't very fleshed out yet.[1]