Hacker News new | past | comments | ask | show | jobs | submit login

It's true, but isn't OP also correct? Ie. it's about speed, which implies locality, which implies approaches like MoE which does exactly that and it's unlike physical brain topology?

Having said that it would be fun to see things like rearrangement data moves based on temerature of silicon parts after training cycle.




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: