
If building a new personal computer where one design goal is running these and future models (inference only) at decent speeds, what should be prioritized? Huge RAM (256 GB?)? Fast RAM? A slow GPU with lots of VRAM vs. a faster GPU with less VRAM? Some super-fast SSD?





That's actually a wonderful question! In terms of trends, I see the following:

1. GPUs are most likely at their limit in terms of FLOPs - float4 / FP4 is most likely the "final" low-precision data type. NVIDIA might add 1.58-bit or FP2 support, but that's unlikely. If there were FP2, it might be around 1.5x faster.

2. Shrinking transistors might still have some room to go, but don't expect 2x or 4x faster - the majority of the speedups came from tensor cores and low-bit representations.

3. We might get more interesting transformer archs since DeepSeek showcased their unique arch for R1 / V3.

Because of these, I would first wait and see - i.e. I would wait until the next OSS model releases (say Llama 3, Gemma 3, etc.) to see what the large model labs are focusing on, then maybe wait for an RTX 50 Super or a cheaper version, or even the RTX 60x series. Larger VRAM is always better.

SSD is good, but not that important - RAM is more important.
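
To make the VRAM > RAM > SSD ordering concrete, here is a rough back-of-the-envelope sketch (not a benchmark). It estimates the memory needed for the weights alone at a given bit width and reports which tier they would land in. The function names, the 70B / 671B parameter counts, and the 24 GB VRAM figure are illustrative assumptions; the 256 GB of RAM comes from the question. It ignores KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    n_params = n_params_billion * 1e9
    return n_params * bits_per_weight / 8 / 1e9


def where_it_fits(n_params_billion: float, bits_per_weight: float,
                  vram_gb: float, ram_gb: float) -> str:
    """Return the slowest memory tier the weights would have to touch."""
    need = weight_memory_gb(n_params_billion, bits_per_weight)
    if need <= vram_gb:
        return "vram"        # everything stays on the GPU
    if need <= vram_gb + ram_gb:
        return "vram+ram"    # part of the model is offloaded to system RAM
    return "ssd"             # would have to be streamed from disk


if __name__ == "__main__":
    # Hypothetical machine: 24 GB VRAM (assumed), 256 GB RAM (from the question).
    for size_b in (70, 671):
        for bits in (16, 8, 4, 1.58):
            need = weight_memory_gb(size_b, bits)
            tier = where_it_fits(size_b, bits, vram_gb=24, ram_gb=256)
            print(f"{size_b}B @ {bits}-bit: ~{need:.0f} GB -> {tier}")
```

Under these assumptions, lower bit widths and more VRAM are what move a model up a tier, and RAM decides whether a large model avoids the SSD at all.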



