
"Everyone just says scaling hypothesis. Everyone neglects to ask, what are we scaling?" [Sutskever] said.

Any guesses?




The conventional teaching that I am aware of says that you can scale across three dimensions: data, compute, parameters. But Ilya's formulation suggests that there may be more dimensions along which scaling is possible.


That's not how I read it. The scaling may still be along those same dimensions, but the object (the "what" that is subjected to scaling) may need to retain some characteristics as it scales.

In other words, there may be a need to retain certain symmetries or constraints from generation to generation, ones that others understand less well than he does (or so he thinks).


You also need to scale data. Since OpenAI has basically exhausted all available text data, there is something to this.



