I understand the terminology in the context of the original papers, but to me the metaphors don't seem to generalize well, or at least not in the suggested direction.
This is probably just an unfortunate artifact of how the field's understanding developed over time. Seeing it pointed out in the slides gave me a sense of relief.
Personally, I wouldn't put names to every minor part of an algorithm or formula that was discovered to work empirically. But then again, I haven't discovered anything, and the authors of the respective papers certainly deserve some credit for their inventions!
I really liked the instructor's (Kolter's) style, and the main reason I like this course so much is that each lecture is followed by an implementation video along with the notebook file.
In most deep learning courses, the implementation is left to TAs and is neither recorded nor made available. This course is an exception. Another bright exception is the NYU Deep Learning course [0] by Yann LeCun and Alfredo Canziani. In that course, too, all the recitations ("Practica") are recorded and made available. And Canziani is a great teacher.
Very nice! I'm also a big fan of the VU Amsterdam deep learning lectures on YouTube. Less systems focus, but a really good intro to modern neural-network-based ML.
Are they going to offer this course again this fall? I think you have to sign up in order to submit assignments, so I'd like it if they offered it again soon.
> “keys”, “queries”, “values”, in one of the least-meaningful semantic designations we have in deep learning
And in the context of LSTMs:
> throwing in some other names, like “forget gate”, “input gate”, “output gate” for good measure
This makes me feel more confident about actually understanding these topics. Before, I was totally misled by the awkward terminology.
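For anyone reading along who hasn't seen where those names actually land in the math, here's a minimal sketch in plain NumPy (my own variable names, not the course's notation) of scaled dot-product attention and a single LSTM step, just to show which tensors the "queries/keys/values" and the "forget/input/output gates" refer to:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention(X, Wq, Wk, Wv):
        # The "queries", "keys", "values" are just three linear projections
        # of the same input sequence X (shape: seq_len x d_model).
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarity
        return softmax(scores, axis=-1) @ V       # weighted mix of the "values"

    def lstm_cell(x, h, c, W, b):
        # One step of an LSTM. The "gates" are just sigmoid-squashed slices
        # of a single affine transform of the concatenated [x, h].
        z = np.concatenate([x, h]) @ W + b
        d = h.shape[0]
        i = sigmoid(z[0:d])        # "input gate"
        f = sigmoid(z[d:2*d])      # "forget gate"
        o = sigmoid(z[2*d:3*d])    # "output gate"
        g = np.tanh(z[3*d:4*d])    # candidate cell update
        c_new = f * c + i * g      # gated update of the cell state
        h_new = o * np.tanh(c_new) # gated output as the new hidden state
        return h_new, c_new

Written out like this, the "keys/queries/values" are just three learned projections of the same tensor, and the "gates" are just slices of one affine transform, which (to me at least) is why the retrieval and gating metaphors only loosely fit.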