I seem to recall that there was a recent theory paper that got a best paper award, but I can't find it.
If I remember correctly, their counter-intuitive result was that big overparameterized models could learn more efficiently, and were less likely to get trapped in poor regions of the optimization space.
[This is also similar to how introducing multimodal training gives an escape hatch to get out of tricky regions.]
So with this hand-wavy argument, it might be the case that two-phase training is needed: a large, overcomplete pretraining phase focused on assimilating all the knowledge, followed by a second phase that makes the model compact. Alternatively, there could be a single hyperparameter that controls overcompleteness vs. compactness, and you adjust it over training.
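If that knob really is just one hyperparameter, a minimal sketch could look like the following (my own toy construction in PyTorch, not from the paper I'm half-remembering): train a deliberately wide model, keep the sparsity penalty at zero for the first half of training, then ramp an L1 coefficient so the second half squeezes the network toward compactness. All names and values here (WIDTH, compactness_coeff, the 1e-4 ceiling) are illustrative assumptions.

```python
import torch
import torch.nn as nn

WIDTH = 4096  # deliberately overcomplete hidden width
model = nn.Sequential(nn.Linear(784, WIDTH), nn.ReLU(), nn.Linear(WIDTH, 10))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

def compactness_coeff(step, total_steps, max_coeff=1e-4):
    # Phase 1 (first half): no sparsity pressure -- pure knowledge assimilation.
    # Phase 2 (second half): linearly ramp an L1 penalty to squeeze the model.
    half = total_steps // 2
    if step < half:
        return 0.0
    return max_coeff * (step - half) / half

def loss_fn(logits, targets, step, total_steps):
    task_loss = nn.functional.cross_entropy(logits, targets)
    l1 = sum(p.abs().sum() for p in model.parameters())
    return task_loss + compactness_coeff(step, total_steps) * l1
```

Magnitude-pruning the near-zero weights at the end would recover the explicit two-phase version of the same idea.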
I don't see that as counter-intuitive at all. If you have a barrier in your cost function in a 1d model, you have to cross over it no matter what. In 2d it could be just a mound that you can go around. More dimensions mean more ways to go around.
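A toy cost function (my own construction, just to make the geometric point concrete):

$$f_{1d}(x) = (x^2 - 1)^2, \qquad f_{2d}(x, y) = (x^2 - 1)^2 \, e^{-y^2}$$

In 1d the two minima at x = ±1 are separated by a barrier of height 1 at x = 0, and any path between them has to climb it. In 2d the same barrier sits along y = 0, but it decays as |y| grows, so the two minima are connected by a nearly flat route that goes around the mound instead of over it. Every extra dimension adds another candidate detour of this kind.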
This is also how the human brain works. A young baby will have something more similar to a fully connected network, whereas an elderly brain will be more of a sparse, minimally connected feed-forward net. The questions are (1) can this be adjusted dynamically in silico, and (2) if we succeed in that, does fine-tuning still work?
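On (1), something like this already exists in the dynamic sparse training literature (e.g. RigL, Evci et al. 2020): keep a binary connectivity mask and periodically prune weak connections while growing new ones where the gradients suggest they would help. A rough sketch, with illustrative drop fraction and scoring choices of my own:

```python
import torch

def update_mask(weight, grad, mask, drop_frac=0.1):
    """Deactivate the weakest active weights; activate inactive ones with the largest gradients."""
    w, g, m = weight.view(-1), grad.view(-1), mask.view(-1)
    n_swap = int(drop_frac * m.sum().item())
    if n_swap == 0:
        return mask
    # Score against the *current* mask so a weight dropped in this update
    # is not immediately regrown by the grow step below.
    score_drop = torch.where(m > 0, w.abs(), torch.full_like(w, float("inf")))
    score_grow = torch.where(m > 0, torch.full_like(g, -1.0), g.abs())
    m[torch.topk(score_drop, n_swap, largest=False).indices] = 0.0  # prune weakest active
    m[torch.topk(score_grow, n_swap, largest=True).indices] = 1.0   # grow most promising inactive
    return mask
```

You would call this every few hundred steps and multiply each weight matrix by its mask after every optimizer step; whether fine-tuning still works on the resulting sparse net (question 2) is exactly the kind of thing you'd have to test empirically.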