Just for some calibration: approx. no one runs 32-bit for LLMs on any sort of i...

mmoskal 3 months ago | on: Qwen3: Think deeper, act faster

Just for some calibration: approx. no one runs 32-bit for LLMs on any sort of iron, big or otherwise. Some models (e.g. DeepSeek V3, and derivatives like R1) are native FP8. FP8 was also common for Llama 3 405B serving.
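
For a rough sense of scale, here is a back-of-the-envelope sketch of weight-only memory at different precisions. It assumes a nominal 405 billion parameters (taken from the model name, not an exact count), uses the standard storage widths of 4, 2, and 1 bytes for FP32, BF16, and FP8, and ignores KV cache, activations, and any quantization scale metadata. The names `BYTES_PER_PARAM`, `weight_memory_gb`, and `NOMINAL_405B` are made up for illustration.

```python
# Storage width per parameter for common weight formats.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}

# Assumption: nominal parameter count implied by the "405B" name.
NOMINAL_405B = 405_000_000_000


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Approximate weight-only memory in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(NOMINAL_405B, dtype):,.0f} GB")
```

Under those assumptions, FP32 weights would need about four times the memory of FP8 weights for the same model, which is one plausible reason serving stacks avoid 32-bit.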