Hacker News
Animats | 1 day ago | on: Lossless LLM compression for efficient GPU inferen...
Once this weight format war settles down, hardware can be built to support it. Presumably you want matrix multiply hardware optimized for whatever weight format turns out to be reasonably optimal.
eoerl | 1 day ago
Optimization is post hoc here: you have to train first to be able to Huffman-encode, so it's not a pure format question.
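A minimal sketch of the reply's point, assuming we Huffman-code the exponent bits of trained float32 weights (the symbol choice is illustrative, not necessarily the linked paper's exact scheme): the code table is derived from symbol frequencies, and those frequencies only exist once training has produced the weights — hence "post hoc".

```python
# Hypothetical sketch: Huffman code lengths depend on the *trained*
# weight distribution, so the code table can only be built after training.
import heapq
import random
import struct
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    # Heap entries: (total weight, unique tiebreak, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Simulate "trained" weights: roughly Gaussian values whose float32
# exponent bits are highly non-uniform, hence compressible.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]
# Little-endian float32: byte 3 holds the sign bit plus the top 7
# exponent bits; mask off the sign and treat the rest as the symbol.
exponents = [struct.pack("<f", w)[3] & 0x7F for w in weights]
freqs = Counter(exponents)

lengths = huffman_code_lengths(freqs)
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / len(exponents)
print(f"distinct symbols: {len(freqs)}, avg code length: {avg_bits:.2f} bits (vs 7 raw)")
```

Because the weights cluster tightly around zero, only a handful of exponent values occur, and the average Huffman code length comes out well under the 7 raw bits — but a different training run (or a different layer) would yield a different frequency table and thus a different code.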