Deep Learning Systems (dlsyscourse.org)
253 points by __rito__ on Aug 12, 2023 | 18 comments


I really like the sneers about some terrible naming, e.g. in the slide on "The self-attention operation":

> “keys”, “queries”, “values”, in one of the least-meaningful semantic designations we have in deep learning

And in the context of LSTMs:

> throwing in some other names, like “forget gate”, “input gate”, “output gate” for good measure

This makes me feel more confident about actually understanding these topics. Before, I was totally misled by the awkward terminology.
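
For reference, here is a minimal NumPy sketch of the operation that slide is naming (single-head, unbatched; the shapes and variable names are my own, not the course's): the "queries" are matched against the "keys" to produce mixing weights over the "values".

    import numpy as np

    def self_attention(X, W_q, W_k, W_v):
        # X: (T, d_model) sequence of token embeddings
        Q = X @ W_q   # "queries": what each position is looking for
        K = X @ W_k   # "keys": what each position offers to be matched against
        V = X @ W_v   # "values": the content that actually gets mixed together
        scores = Q @ K.T / np.sqrt(K.shape[-1])         # (T, T) pairwise compatibilities
        scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
        return weights @ V   # each output is a weighted average of the values

    T, d = 5, 16
    rng = np.random.default_rng(0)
    X = rng.standard_normal((T, d))
    W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
    out = self_attention(X, W_q, W_k, W_v)   # shape (5, 16)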


It gets even worse when you see how the ML community likes to bastardize terms from neuroscience.

But I think it is good to have memorable names that one can use to talk about the concepts verbally.


simply operator overloading


> “forget gate”, “input gate”, “output gate”

These are legit:

cell_t = cell_(t-1) * forget_gate + tanh(linear(input_t)) * input_gate

out_t = cell_t * output_gate

See? forget_gate masks the previous cell state by multiplying it with numbers in [0, 1], input_gate controls how much of the new input gets written, and output_gate controls the output, of course.
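
In code form, a minimal NumPy sketch of one LSTM step along those lines (variable names are mine; note that the standard formulation also passes the cell through a tanh before applying the output gate):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, W, b):
        # W maps the concatenated [x_t, h_prev] to the four gate pre-activations
        z = W @ np.concatenate([x_t, h_prev]) + b
        i, f, o, g = np.split(z, 4)
        input_gate  = sigmoid(i)   # how much of the new candidate to write
        forget_gate = sigmoid(f)   # how much of the old cell state to keep
        output_gate = sigmoid(o)   # how much of the cell to expose as output
        candidate   = np.tanh(g)   # the tanh(linear(input)) term from above
        c_t = forget_gate * c_prev + input_gate * candidate
        h_t = output_gate * np.tanh(c_t)   # standard LSTM adds this tanh
        return h_t, c_t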


The names made sense to me once I understood what they represent. What else would you call them?


I understand the terminology in the context of the original papers, but to me the metaphors don't seem to generalize well, or at least not in the suggested direction.

This is probably just an unfortunate situation, due to progressive understanding. Pointing this out in the slides gave me a sense of relief.

Personally, I wouldn't put names to every minor part of an algorithm or formula that was discovered to work empirically. But then again, I haven't discovered anything, and the authors of the respective papers certainly deserve some credit for their inventions!


I also enjoy how ML likes to twist statistical nomenclature enough to be irritating.


I find open educational resources just so dang heartwarming.


This is a particularly unique course, offering an introduction to ML compilation and deployment :)


I really liked the style of the instructor (Kolter), and the main reason I like this course so much is that each lecture is followed by an implementation video along with the notebook file.

In most Deep Learning courses, the implementation is left to TAs and neither recorded nor made available. This course is an exception. Another bright exception is NYU Deep Learning course [0] by Yann LeCun and Alfredo Canziani. In that course, too, all recitations ("Practica") are recorded and made available. And Canziani is a great teacher.

[0]: https://atcold.github.io/pytorch-Deep-Learning


I also really like the instructor for this course!

Seems like he really cares. I looked him up and I guess he was a student of Andrew Ng (the legendary ML lecturer!!) so it makes sense.


Thanks, this is a wonderful recommendation


Very nice! I'm also a big fan of the VU Amsterdam deep learning lectures on YouTube. Less systems focus, but a really good intro to modern neural-network-based ML.



Are they going to offer this course again this Fall? I think you have to sign up in order to submit assignments so I'd like it if they offered it again soon.


Excited to see the MLSys field growing.

Deep learning methods are so computationally intensive that many advances have come through new algorithms and optimization methods.


Took this class when it was offered for the first time while I was at CMU--it's a really great course and well organized!


This looks good as it covers hardware acceleration, which is a gap in my knowledge that I would like to start to understand.



