hchja 1 day ago | on: Lossless LLM compression for efficient GPU inferen...
This is pretty useless in any case that doesn’t involve BFloat16 models
spindump8930 1 day ago
bf16 is the de facto default datatype and distribution format for LLMs, which are then often eagerly quantized by users with more limited hardware. See the recent Llama releases and e.g. the H100 spec sheet (the advertised FLOPS and metrics target bf16).
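To make both points concrete, a minimal sketch (assumes the Hugging Face transformers package with bitsandbytes installed; the model id is illustrative and the official repo is gated, so substitute any recent open-weight checkpoint):

    from transformers import AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig

    # Recent open-weight releases ship in bfloat16; the config records it.
    cfg = AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B")
    print(cfg.torch_dtype)  # torch.bfloat16 for the official checkpoint

    # What many users on limited hardware then do: quantize at load time.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.1-8B",
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    )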
throwaway314155 1 day ago
So an ever-shrinking number of cases?