Any clues to how they plan to invest $500 billion dollars? What infrastructure a...

burnte · 2025-01-21T22:43:11 1737499391

That was literally my question. Is this basically just for more datacenters, NVidia chips, and electricity with a sprinkling of engineers to run it all? If so, then that $500bn should NOT be invested in today's tech, but instead in making more powerful and power efficient chips, IMO.

kristianp · 2025-01-22T01:27:11 1737509231

Nvidia and TSMC are already working on more powerful and efficient chips, but the physical limits to scaling mean lots more power is going to be used in each new generation of chips. They might improve by offering specific features such as FP4, but Moore's law is still dead.

pillefitz · 2025-01-22T20:16:04 1737576964

And we are, running on a 20W brain.

bitmasher9 · 2025-01-21T22:50:01 1737499801

I don’t know if $500bn could put anyone ahead of nvidia/tmc.

amluto · 2025-01-22T00:06:26 1737504386

$500bn of usefully deployed engineering, mostly software, seems like it would put AMD far ahead of Nvidia. Actually usefully deploying large amounts of money is not so easy, though, and this would still go through TSMC.

entropicdrifter · 2025-01-21T23:09:00 1737500940

Nvidia's in on it, so presumably this is a doubling-down on Nvidia as the chip developers

bdangubic · 2025-01-22T00:27:29 1737505649

if only $500bn was enough to make more powerful and power efficient chips…

Havoc · 2025-01-22T01:03:44 1737507824

Add some nuclear power and you’ve suddenly got a big bill

burnte · 2025-01-22T19:48:01 1737575281

Not really. Plant Vogtle in Georgia was way over budget and still was "only" $35bn. $500bn could get you 14 of those.

patall · 2025-01-21T22:52:33 1737499953

He wanted to do that, but would have needed 5T for that. Only got 100 bn so far, so this is what you get (only slightly /s)

TrainedMonkey · 2025-01-21T22:44:12 1737499452

I'll make a wild guess that they will be building data centers and maybe robotic labs. They are starting with 100B of committed by mostly Softbank, but probably not transacted yet, money.

> building new AI infrastructure for OpenAI in the United States

The carrot is probably something like - we will build enough compute to make a supper intelligence that will solve all the problems, ???, profit.

K0balt · 2025-01-22T00:28:36 1737505716

If we look at the processing requirements in nature, I think that the main trend in AI going forward is going to be doing more with less, not doing less with more, as the current scaling is going.

Thermodynamic neural networks may also basically turn everything on its ear, especially if we figure out how to scale them like NAND flash.

If anything, I would estimate that this is a space-race type effort to “win” the AI “wars”. In the short term, it might work. In the long term, it’s probably going to result in a massive glut in accelerated data center capacity.

The trend of technology is towards doing better than natural processes, not doing it 100000x less efficiently. I don’t think AI will be an exception.

If we look at what is -theoretically- possible using thermodynamic wells, with current model architectures, for instance, we could (theoretically) make a network that applies 1t parameters in something like 1cm2. It would use about 20watts, back of the napkin, and be able to generate a few thousand T/S.

Operational thermodynamic wells have already been demonstrated en silica. There are scaling challenges, cooling requirements, etc but AFAIK no theoretical roadblocks to scaling.

Obviously, the theoretical doesn’t translate to results, but it does correlate strongly with the trend.

So the real question is, what can we build that can only be done if there are hundreds of millions of NVIDIA GPUs sitting around idle in ten years? Or alternatively, if those systems are depreciated and available on secondary markets?

What does that look like?

pillefitz · 2025-01-22T20:20:06 1737577206

What is a thermodynamic well? Couldn't find much on it.

K0balt · 2025-01-23T02:05:38 1737597938

Extropic (and others) are working on it. It’s a very fast and efficient way to do the big math and state problems associated with LLMs and ML in general. It does the complex matrix algebra in a single “gate” as an analog system.

Extropic update on building the ultimate substrate for generative AI https://twitter.com/Extropic_AI/status/1820577538529525977

disambiguation · 2025-01-21T23:24:29 1737501869

Yachts, mansions, private jets, maybe some very expensive space heaters.

croddin · 2025-01-21T22:48:50 1737499730

This could be a clue

https://x.com/sama/status/1756090136935416039

jppope · 2025-01-21T23:09:16 1737500956

Reasonably speaking, there is no way they can know how they plan to invest $500 billion dollars. The current generation of large language models basically use all human text thats ever been created for the parameters... not really sure where you go after than using the same tech.

Philpax · 2025-01-21T23:16:13 1737501373

That's not really true - the current generation, as in "of the last three months", uses reinforcement learning to synthesize new training data for themselves: https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero

bandrami · 2025-01-21T23:49:51 1737503391

It worked well for the Habsburg family; what could go wrong?

XorNot · 2025-01-21T23:44:49 1737503089

Right but that's kind of the point: there's no way forward which could benefit from "moar data". In fact it's weird we need so much data now - i.e. my son in learning to talk hardly needs to have read the complete works of Shakespeare.

If it's possible to produce intelligence from just ingesting text, then current tech companies have all the data they need from their initial scrapes of the internet. They don't need more. That's different to keeping models up to date on current affairs.

throwaway4aday · 2025-01-22T00:09:53 1737504593

That's essentially what R1 Zero is showing:

> Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.

YetAnotherNick · 2025-01-22T05:56:42 1737525402

O3 high compute requires 1000s of dollars to solve one medium complexity problem like ARC.

artificialprint · 2025-01-22T22:26:45 1737584805

Light bulbs used to be expensive too, nails as well.

rapjr9 · 2025-01-22T08:17:03 1737533823

The latest hype is around "agents", everyone will have agents to do things for them. The agents will incidentally collect real-time data on everything everyone uses them for. Presto! Tons of new training data. You are the product.

cavisne · 2025-01-21T23:54:34 1737503674

The new scaling vector is “test time compute” ie spending more compute in inference.

jazzyjackson · 2025-01-21T23:25:37 1737501937

It seems to me you could generate a lot of fresh information from running every youtube video, every hour of TV on archive.org, every movie on the pirate bay -- do scene by scene image captioning + high quality whisper transcriptions (not whatever junk auto-transcription YouTube has applied), and use that to produce screenplays of everything anyone has ever seen.

I'm not sure why I've never heard of this being done, it would be a good use of GPUs in between training runs.

jensvdh · 2025-01-21T23:48:03 1737503283

The fact that OpenAI can just scrape all of Youtube and Google isn't even taking legal action or attempting to stop it is wild to me. Is Google just asleep?

bdangubic · 2025-01-22T00:31:09 1737505869

what are they going to use to sue - DMCA? OpenAI (and others) are scraping everything imaginable (MS is scraping private Github repos…) - don’t think anyone in the current government will be regulating any of this anytime soon

lanstin · 2025-01-22T01:25:12 1737509112

Such a biased source of data-that gets them all the LaTeX source for my homeworks, but not my professor's grading of the homework, and not the invaluable words I get from my professor at office hours. No wonder the LLMs have bizarre blindnesses in different directions.

bdangubic · 2025-01-22T14:00:23 1737554423

Such a biased source of data-that gets them all the LaTeX source for my homeworks

but also myriad of hardcore private repositories of many high-tech US enterprises hacking amazing shit (mine included) :)

lanstin · 2025-01-23T04:19:40 1737605980

One reason LLMs are better at commercial coding than maths.

airstrike · 2025-01-21T23:37:27 1737502647

Don't forget every hour of news broadcasting, of which we likely won't run out any time soon. Plus high quality radio

ilaksh · 2025-01-22T00:21:55 1737505315

I think that this is the obvious path to more robust models -- grounding language on video.

miltonlost · 2025-01-21T23:45:08 1737503108

> a lot of fresh information from running every youtube video

EVERY youtube video?? Even the 9/11 truther videos? Sandy Hook conspiracy videos? Flat earth? Even the blatantly racist? This would be some bad training data without some pruning.

lanstin · 2025-01-22T01:46:17 1737510377

The best videos would be those where you accidentally start recording and you get 2 hours of naturalistic conversation between real people in reality. Not sure how often they are uploaded to YouTube.

Part of the reason that kids need less material is that the aren't just listening, they are also able to do experiments to see what works and what doesn't.

riku_iki · 2025-01-21T23:57:58 1737503878

I think there is huge amount of corporate knowledge.

layer8 · 2025-01-21T23:42:59 1737502979

I’m more interested in how they plan to draw the rest of the damn owl.

lukeplato · 2025-01-21T22:48:35 1737499715

hopefully nuclear power plants

HarHarVeryFunny · 2025-01-22T00:12:05 1737504725

They are going to buy 50 $10B nuclear aircraft carriers and use them as a power source.

MangoCoffee · 2025-01-21T22:40:51 1737499251

data center + gpu server farm (?)

mrandish · 2025-01-21T22:44:32 1737499472

Plus power plants to drive the massive data centers. At large enough scale, power availability and cost is a constraint.

paulnpace · 2025-01-22T00:51:32 1737507092

Congress.