Oh I also upload Q8_K_XL for eg, which will upcast important layers to BF16 / F1...

DrPhish · 2025-07-25T21:57:41 1753480661

Thanks Daniel. I know you upload them, but I was hoping for some solid numbers on your dynamic q8 vs a naive quant. There doesn't seem to be anything on either of those links to show improvement at those quant levels.

My gut feeling is that there's not enough benefit to outweigh the risk of putting a middleman in the chain of custody from the original model to my nvme.

However, I can't know for sure without more testing than I have the time or inclination for, which is why I was hoping there had been some analysis you could point me to.