
> Also are all values equal in importance?

This comes back to the Shannon question at hand (partly answered in my next answer).

> Also why use floats at all? Integer multiplication is cheap.

Gaussianity: weights and activations are roughly Gaussian-distributed, so the logarithmic spacing of float values puts precision where the mass actually is. And floats cost about the same as integer multiplies where we're using them in current GPGPUs/tensor cores (though if Horace He steps in and corrects me on some detail of this/etc I'll gladly defer).
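To make the Gaussianity point concrete, here's a rough numpy sketch (my own illustration, not anything from the thread): round Gaussian samples to float16, and separately to a uniform 16-bit integer-style grid over the same span, then compare the error near zero, where most of the Gaussian mass lives.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

# float16: spacing shrinks toward zero, exactly where Gaussian mass sits
err_fp16 = np.abs(x - x.astype(np.float16).astype(np.float64))

# uniform 16-bit grid (integer-style quantization) over the same span
lo, hi = x.min(), x.max()
step = (hi - lo) / (2**16 - 1)
q = lo + np.round((x - lo) / step) * step
err_int16 = np.abs(x - q)

# compare mean absolute error for the small values near zero
small = np.abs(x) < 0.1
print(err_fp16[small].mean(), err_int16[small].mean())
```

Near zero the float16 error is much smaller than the uniform grid's, which is the sense in which floats "fit" Gaussian-ish data better than same-width integers.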

> are some ranges of values more important than others?

See above. Range is also a good way to keep from NaNs without the overhead of explicit NaN-checking steps. Think of it as a savings account for a rainy day of capacity.
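A quick illustration of that "savings account" (my example, using numpy's float16, whose max finite value is 65504): an activation spike that exceeds the format's range overflows to inf, and a subsequent subtraction turns it into NaN, whereas a wider-range format absorbs the same values.

```python
import numpy as np

# float16 tops out at 65504: one oversized product overflows to inf...
a = np.float16(60000)
big = a * np.float16(2)
print(big)          # inf

# ...and inf - inf (e.g. a max-shift inside a softmax) yields NaN
print(big - big)    # nan

# float32 has the range headroom to shrug off the same values
b = np.float32(60000)
print(b * 2 - b * 2)  # 0.0
```

So extra dynamic range buys you freedom from sprinkling NaN/inf checks through the training loop.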

> The reason for having more bits is having large numbers of incoming or outgoing connections.

This is good intuition, though the network survives on surprisingly little precision. I had a similar feeling until one magical moment with hlb-CIFAR10 where I had to keep kicking up the quantization regularization for it to do well (for one of the older versions, at least).

> My point here is that this seems a hotly debated topic but people aren't using a lot of the type of statistical arguments I would expect for that.

I agree to a degree, though in my modality of thought I would replace it with information theory, since that directly tells us a few things we should be able to expect during network training — as you noted in your second-to-last paragraph with noise/rounding errors/etc., which I think is good stuff.

However, the empirical numbers show pretty clearly that it works well, so I'm not sure where the need for hot debate is. RWKV is one scaled model that uses it, for example. You're sort of shooting yourself in the foot by not using it these days, GPU memory being what it is. A flat 2x memory saving (for model weights) is huge, even if it only helps memory transfers. Lots of networks are memory-bound these days, unfortunately.
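The 2x figure is just bytes-per-parameter arithmetic. A back-of-envelope sketch (the 7B parameter count here is a hypothetical round number, not a claim about any particular model):

```python
# weight memory for a hypothetical 7B-parameter model at two precisions
params = 7_000_000_000
gib = 1024**3  # bytes per GiB

for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2)]:
    print(f"{name}: {params * bytes_per_param / gib:.1f} GiB")
```

Halving bytes per weight halves the footprint, and on a memory-bound network it also roughly halves the weight-transfer traffic per step.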

I think you have good NN-related intuition. I feel like you would find it fun to play around with (if you haven't already). Many thanks for sharing, I greatly appreciated your response. It made me think a bit, and that especially is something I value. So thank you very much for that. <3 :) :thumbsup: :thumbsup:
