You said it. I've been working on some ConvNets for object localization in an image over the past couple weeks and it took days to figure out why my network just seemed to be randomly guessing (50% accuracy).
In the end, reducing the learning rate (with SGD) is what made things work, and it felt like magic.
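To give a feel for why the step size alone can make or break training, here is a toy sketch. It is not my actual network, just plain gradient descent on the 1-D function f(w) = w**2, and the function name and the two rates are made up for illustration. With too large a rate every step overshoots the minimum and the weight grows. With a smaller one it shrinks toward zero.

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent and return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w = w - lr * grad   # update step scaled by the learning rate
    return w

# Hypothetical rates chosen only to show the two regimes.
too_high = gradient_descent(lr=1.1)  # each step multiplies w by (1 - 2.2) = -1.2
lower = gradient_descent(lr=0.1)     # each step multiplies w by (1 - 0.2) = 0.8

print("lr=1.1 final |w|:", abs(too_high))
print("lr=0.1 final |w|:", abs(lower))
```

A real ConvNet loss surface is nowhere near this simple, but the same overshooting can leave a network stuck at chance-level accuracy.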
I've now started reading the Deep Learning textbook by Ian Goodfellow et al. (http://www.deeplearningbook.org), hoping a solid foundation will build some intuition for reasoning about these hyperparameters.