hchja 1 day ago | on: Lossless LLM compression for efficient GPU inferen...
This is pretty useless in any case that doesn’t involve BFloat16 models
spindump8930 1 day ago
bf16 is the de facto default datatype and distribution format for LLMs, which are then often eagerly quantized by users with more limited hardware. See the recent Llama releases and e.g. the H100 spec sheet (the advertised FLOPS and metrics target bf16).
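To make both points concrete, a minimal sketch (assumes the Hugging Face transformers package with bitsandbytes installed; the model id is illustrative and the official repo is gated, so substitute any recent open-weight checkpoint):

    from transformers import AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig

    # Recent open-weight releases ship in bfloat16; the config records it.
    cfg = AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B")
    print(cfg.torch_dtype)  # torch.bfloat16 for the official checkpoint

    # What many users on limited hardware then do: quantize at load time.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.1-8B",
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    )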
throwaway314155 1 day ago
So an ever-shrinking number of cases?