
Just for some calibration: approximately no one runs 32-bit for LLMs on any sort of iron, big or otherwise. Some models (e.g. DeepSeek V3, and derivatives like R1) are native FP8. FP8 was also common for Llama 3 405B serving.
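
To put rough numbers on why precision matters here, a back-of-the-envelope sketch of weights-only memory for a 405B-parameter model. The bytes-per-parameter table and the `weight_gb` helper are my own illustration, not taken from any serving stack, and the estimate ignores KV cache, activations, and runtime overhead:

```python
# Weights-only estimate: bytes per parameter times parameter count.
# Assumption: plain dense storage, no quantization scales or padding.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_gb(n_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes) for a given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    n_params = 405e9  # parameter count implied by the "405B" name
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_gb(n_params, dtype):,.0f} GB")
```

That comes out to about 1,620 GB at FP32, 810 GB at BF16, and 405 GB at FP8, before counting anything besides the weights.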

