
You still have to pay for the memory. The Cerebras chip is fast because it has roughly 700x more SRAM than, say, an A100 GPU. Loading the whole model into SRAM every time you compute one token is the expensive bit.
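
To put rough numbers on why that per-token load matters, here is a back-of-the-envelope sketch. It assumes batch size 1 and that every weight has to be read once per generated token, so memory bandwidth sets the ceiling. The function name and all the figures (70B parameters, 16-bit weights, about 2 TB/s of off-chip bandwidth) are illustrative assumptions, not measurements of any particular chip.

```python
def max_tokens_per_second(n_params: float,
                          bytes_per_param: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Upper bound on batch-1 decode speed if all weights are read once per token.

    Hypothetical helper for illustration; it ignores compute time, KV cache
    traffic, and any overlap or caching effects.
    """
    model_bytes = n_params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


# Assumed figures: 70e9 parameters, 2 bytes each, 2e12 bytes/s memory bandwidth.
bound = max_tokens_per_second(70e9, 2, 2e12)
print(f"{bound:.1f} tokens/s")  # prints "14.3 tokens/s"
```

Under these assumptions the bound scales linearly with bandwidth, which is the case for keeping the weights in much faster on-chip memory and paying for the extra SRAM instead.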

