"one small step at a time, and one giant leap, together."
I didn't like this part:
5090 for $2000, about $500 more than the 4090 cost when it was announced.
They didn't mention the VRAM amount though, and I doubt it's more than 24GB. If the Apple M4 Ultra gets close to the 5090's 1.8 TB/s of bandwidth, it'll crush GeForce once and for all.
Also nitpick: the opening video said tokens are responsible for all AI, but that only applies to a subset of AI models...
When the retail price is so far below the "street" price, the card just becomes harder to obtain and scalpers take a bigger cut. Raising the price to something more normal at least gives you more of a chance at the big-box store.
Or scalpers won’t be dissuaded and the street price for a 5090 will be $3200 or more. $1500 was already an insane price for scalpers to pay, but they did it anyway.
The scalpers are trying to arbitrage the gap between the price when buying directly from retailers and the price on the open secondary market.
Increasing the retail price doesn't increase the price on the secondary market; it just lowers the scalpers' margin.
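To put numbers on it, a minimal sketch (the street price here is purely hypothetical):

    # Scalper margin = street price - retail price.
    street_price = 3200               # hypothetical street price, not a real figure
    old_retail, new_retail = 1500, 2000
    print(street_price - old_retail)  # 1700: margin at the old MSRP
    print(street_price - new_retail)  # 1200: margin shrinks; the street price itself doesn't move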
Well for one thing it’s a lot easier to communicate expectations to consumers at CES.
“Coming soon to auction at a price set by users, no idea what that will be though, good luck!” is much less compelling for consumers trying to plan their purchases in advance.
Being able to decide the price and who you sell your product to is huge leverage. Nvidia can go to a retailer stocking something they don't like side by side on the shelf: hey, ditch this and we'll make you a price. It's never that overt, of course, and it can play out geopolitically too: hey government, you want chips? We have chips, and it would be a shame if the market grabbed them before you. BTW, don't forget my tax cut.
If anyone in this thread had watched the linked video or even read a summary, they ought to be at least talking about the DIGITS announcement.
128GB of unified memory for $3000.
Slow? Yes. It isn't meant to compete with the datacenter chips; it's just a way to stop the embarrassment of being beaten at HPC workstations by Apple. But it does the job.
Jensen didn't talk about it. Maybe he knows it's embarrassingly low. Everyone knows Nvidia won't give us more VRAM, to avoid cannibalizing their enterprise products.
The official specs for the 5090 have been out for days on nvidia.com, and they explicitly state it's 32GB of GDDR7 with a 512-bit bus, for a total of 1.8TB/s of bandwidth.
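For anyone who wants to sanity-check that, the implied per-pin rate falls straight out of those two numbers (the 28 Gbps figure is my arithmetic, not a quoted spec):

    # Per-pin data rate implied by a 512-bit bus at 1.8 TB/s total.
    bus_width_bits = 512
    total_gbit_per_s = 1.8 * 1000 * 8          # 1.8 TB/s -> 14400 Gbit/s
    print(total_gbit_per_s / bus_width_bits)   # 28.125 Gbps per pin, plausible for GDDR7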
This feels like a weird complaint, given you started by saying it was 24GB, and then argued that the person who told you it was actually 32GB was making that up.
All else equal, this means that price per GB of VRAM stayed the same. But in reality, other things improved too (like the bandwidth) which I appreciate.
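Quick napkin math with the launch prices mentioned upthread ($1500 and $2000 are the figures other commenters used, not prices I'm vouching for):

    print(1500 / 24)  # 62.5 $/GB for the 4090 (24GB)
    print(2000 / 32)  # 62.5 $/GB for the 5090 (32GB) -- identical, as claimed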
I just think that for home AI use, 32GB isn't that helpful. In my experience and especially for agents, models at 32B parameters just start to be useful. Below that, they're useful only for simple tasks.
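Rough napkin math on weight memory alone, assuming common quantization levels (the bit widths are illustrative, not a claim about any particular model):

    # Memory for the weights of a 32B-parameter model at various precisions.
    params = 32e9
    for bits in (16, 8, 4):
        print(f"{bits}-bit: {params * bits / 8 / 1e9:.0f} GB")
    # 64, 32 and 16 GB respectively -- before any KV cache or activations

So 32GB buys you a 4-bit 32B model with some room for context, but nothing bigger.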
Yes, home / hobbyist LLM users are not overly excited about this, but
a) they are never going to be happy,
b) it's actually a significant step up given the bulk are dual-card users anyway, so this bumps them from (at the high end of the consumer segment) 48GB to 64GB of VRAM, which _is_ pretty significant given the prevalence of larger models / quants in that space, and
c) vendors really don't care terribly much about the home / hobbyist LLM market, no matter how much people in that market wish otherwise.
Supposedly, this image of an Inno3D 5090 box leaked, revealing 32GB of VRAM. It seems like the 5090 will be more of a true halo product, given the pricing of the other cards.