I understand the terminology in the context of the original papers, but to me the metaphors don't seem to generalize well, or at least not in the suggested direction.
This is probably just an unfortunate artifact of how the field's understanding developed over time. Seeing it pointed out in the slides gave me a sense of relief.
Personally, I wouldn't put names to every minor part of an algorithm or formula that was discovered to work empirically. But then again, I haven't discovered anything, and the authors of the respective papers certainly deserve some credit for their inventions!
I really liked the instructor's (Kolter's) style, and the main reason I like this course so much is that each lecture is followed by an implementation video along with the notebook file.
In most deep learning courses, the implementation is left to TAs and is neither recorded nor made available. This course is an exception. Another bright exception is the NYU Deep Learning course [0] by Yann LeCun and Alfredo Canziani. In that course, too, all the recitations ("Practica") are recorded and made available. And Canziani is a great teacher.
Very nice! I'm also a big fan of the VU Amsterdam deep learning lectures on YouTube. Less systems focus, but a really good intro to modern neural-network-based ML.
Are they going to offer this course again this fall? I think you have to sign up in order to submit assignments, so I'd like it if they offered it again soon.
> “keys”, “queries”, “values”, in one of the least-meaningful semantic designations we have in deep learning
And in the context of LSTMs:
> throwing in some other names, like “forget gate”, “input gate”, “output gate” for good measure
This makes me feel more confident about actually understanding these topics. Before, I was totally misled by the awkward terminology.
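For anyone reading along who hasn't seen where those names actually land in the math, here's a minimal sketch in plain NumPy (my own variable names, not the course's notation) of scaled dot-product attention and a single LSTM step, just to show which tensors the "queries/keys/values" and the "forget/input/output gates" refer to:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention(X, Wq, Wk, Wv):
        # The "queries", "keys", "values" are just three linear projections
        # of the same input sequence X (shape: seq_len x d_model).
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarity
        return softmax(scores, axis=-1) @ V       # weighted mix of the "values"

    def lstm_cell(x, h, c, W, b):
        # One step of an LSTM. The "gates" are just sigmoid-squashed slices
        # of a single affine transform of the concatenated [x, h].
        z = np.concatenate([x, h]) @ W + b
        d = h.shape[0]
        i = sigmoid(z[0:d])        # "input gate"
        f = sigmoid(z[d:2*d])      # "forget gate"
        o = sigmoid(z[2*d:3*d])    # "output gate"
        g = np.tanh(z[3*d:4*d])    # candidate cell update
        c_new = f * c + i * g      # gated update of the cell state
        h_new = o * np.tanh(c_new) # gated output as the new hidden state
        return h_new, c_new

Written out like this, the "keys/queries/values" are just three learned projections of the same tensor, and the "gates" are just slices of one affine transform, which (to me at least) is why the retrieval and gating metaphors only loosely fit.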