Sure, I'll give it a shot. Feel free to email me if you have further questions; my email is in my profile.

1. I think it makes sense to try them for any classification, regression, or feature extraction problem. They don't work all the time: sometimes you really don't need the extra depth (one hidden layer can be fine), and they can be pretty slow to train, even on a GPU. I've also seen people build their own, implement it wrong, get bad results, and then complain that NNs don't work. So test for yourself, just make sure you're not doing it wrong.

2. It really depends. More is almost always better.

3. The closest thing to out of the box is training a bunch of models with Bayesian optimization picking the hyperparameters (so you don't have to), then putting the last few models in an ensemble and averaging their results. This is the workflow we use with ersatz.

4. Despite lunches not being free, you should probably use dropout. It's ridiculously good at preventing overfitting, but it can take longer to train (although there's been some work on "fast dropout" to speed it up).

5. A GPU gets you a ~40x speedup over a CPU. So if you're using a CPU and I'm using a GPU, I can do in 1 day what would take you about 40 days. And then I might train for a week or more on the GPU (I think the imagenet models were trained for a week or two, but I'm not sure how many GPUs were used). Otherwise, computational effort varies.

6. You use mini batches: you load as many samples as fit in GPU memory (alongside the model params) and then split those into smaller batches. You swap in a new "large batch" periodically. Neural networks can keep taking in new data and updating the model (online learning), which makes them particularly attractive for very large data sets.
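
To make the batching in point 6 concrete, here's a rough sketch in plain Python. The function name `iter_minibatches` and both size parameters are made up for illustration, and slicing a list stands in for the actual host-to-GPU copy, so treat it as the shape of the loop rather than how any particular library does it.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_minibatches(
    samples: Sequence[T],
    large_batch_size: int,
    mini_batch_size: int,
) -> Iterator[List[T]]:
    """Yield mini batches by walking the data one "large batch" at a time.

    large_batch_size is a hypothetical stand-in for "as many samples as
    fit in GPU memory next to the model params".
    """
    if large_batch_size <= 0 or mini_batch_size <= 0:
        raise ValueError("batch sizes must be positive")

    for start in range(0, len(samples), large_batch_size):
        # Stand-in for copying the next large batch into GPU memory.
        large_batch = list(samples[start:start + large_batch_size])

        # Split the resident large batch into the mini batches that
        # each training step would actually consume.
        for offset in range(0, len(large_batch), mini_batch_size):
            yield large_batch[offset:offset + mini_batch_size]


# Usage: 10 samples, 6 fit "on the GPU" at once, 4 per training step.
batches = list(iter_minibatches(list(range(10)), large_batch_size=6, mini_batch_size=4))
```

Note that the last mini batch of each large batch can come out short; whether you pad it, drop it, or just train on it is a judgment call.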

General points: use a GPU, don't build your own unless it's an academic exercise, use dropout, and test empirically on your own data. And check out Bayesian optimization of hyperparameters; I'm becoming more and more convinced it's better at picking them than human experts anyway.
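
If you haven't seen dropout before, here's a minimal sketch of the idea in plain Python. It assumes the common "inverted dropout" variant, where kept activations are rescaled during training so nothing changes at test time; the function name and parameters are my own for illustration, and in practice you'd use whatever your library provides rather than this.

```python
import random
from typing import List, Optional


def dropout(
    activations: List[float],
    drop_prob: float,
    training: bool,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Inverted dropout over a flat list of activations (illustrative only)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")

    # At test time (or with no dropout) the activations pass through unchanged.
    if not training or drop_prob == 0.0:
        return list(activations)

    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob

    # Zero each unit with probability drop_prob; scale the survivors by
    # 1 / keep_prob so the expected activation stays the same.
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


# Usage: drop half the units of a layer during training.
hidden = [1.0] * 10000
dropped = dropout(hidden, drop_prob=0.5, training=True, rng=random.Random(0))
```

Because a different random subset of units is silenced on every pass, the network can't lean too heavily on any one of them, which is where the regularization comes from.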