Like AI today, there were exciting classes of applications in the 70s, 80s, and 90s that mandated pricier hardware: anything 3D-related, multi-user systems, higher-end CAD/EDA tooling, and any server that actually got put under "real" load (more than 20 users).
If anything this isn’t so bad: $4K in 2025 dollars is an affordable desktop computer from the 90s.
The thing is, I'm not that interested in running something that will run on a $4K rig. I'm a little frustrated by articles like this, because they claim to be running "R1" but it's a quantized version and/or it has a small context window... it's not meaningfully R1. I think to actually run R1 properly you need more like $250k.
But it's hard to tell, because most of the stuff posted is people trying to do duct-tape-and-baling-wire solutions.
I can run the 671B-Q8 version of R1 with a big context on a used dual-socket Xeon I bought for about $2k with 768GB of RAM. It gets about 1-1.5 tokens/sec, which is fine if you give it a prompt and just come back an hour or so later. To get to many tens of tokens/sec, you would need >8 GPUs with 80GB of HBM each, and you're probably talking well north of $250k. For the price, the "used workstation with a ton of DDR4" approach works amazingly well.
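For anyone curious why this works at all: 671B parameters at 8 bits per weight is roughly 671 GB of weights, which is why 768GB of system RAM is (just barely) enough, and why CPU memory bandwidth rather than compute is what caps you at ~1 token/sec. A minimal sketch of the CPU-only setup, assuming llama.cpp's Python bindings and a local Q8_0 GGUF of R1; the file path, context size, and thread count below are placeholders, not what the parent poster used:

```python
from llama_cpp import Llama

# Load a Q8_0 GGUF of DeepSeek-R1 entirely into system RAM (no GPU offload).
# model_path is a placeholder; a real Q8 R1 is split across many GGUF shards,
# and you point at the first shard.
llm = Llama(
    model_path="/models/DeepSeek-R1-Q8_0-00001-of-00015.gguf",
    n_ctx=16384,       # "big context" costs extra RAM for the KV cache
    n_threads=64,      # roughly the physical core count of the dual-socket box
    n_gpu_layers=0,    # CPU-only
)

# At ~1-1.5 tokens/sec, a long reasoning trace is an "ask and walk away" job.
out = llm(
    "Explain the tradeoffs of running a 671B MoE model on CPU vs GPU.",
    max_tokens=2048,
)
print(out["choices"][0]["text"])
```

At 1-1.5 tokens/sec, a 3,000-token answer takes somewhere between 30 and 50 minutes, which matches the "come back an hour later" workflow.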