The conventional teaching that I am aware of says that you can scale across three dimensions: data, compute, parameters. But Ilya's formulation suggests that there may be more dimensions along which scaling is possible.
That's not how I read it. The scaling may still happen along those same three axes, but the object (the "what" that is being scaled) may need to retain certain characteristics as it scales.
In other words, there may be a need to retain certain symmetries or constraints from generation to generation, ones that others understand less well than he does (or so he thinks).
Any guesses?