
No company can afford to spend $100B on something that will be obsolete a year later; you just can't recover the investment from sales that quickly.

$100m is manageable: if you've got 100m paying subscribers or companies using your API for a year, you can recoup the costs, but there aren't many companies with 100m users to monetise. $1B feels like it's pushing it; only a few companies in the world can monetise that, and realistically it's about lasting through the next round to be able to continue competing, not about making the money back.

$100B though, that's a whole different game again. That's like asking for the biggest private investment ever made, for capex that depreciates at $50B a year. You'd have to be stupid to do it. The public markets wouldn't take it.

Investing that much in hardware that depreciates over 5+ years and is theoretically still usable at the end, maybe, but even then the biggest companies in the world are still spending an order of magnitude less per year, so the numbers end up working out very differently. Plus that's companies with 1B users ready to monetise.
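Back-of-envelope, with my own assumed subscriber counts, prices, and margin (not numbers from this thread), the recoup math looks something like:

    # Sketch: years to recoup a training run from subscription revenue.
    # All inputs below are illustrative assumptions, not real figures.
    def years_to_recoup(capex, users, revenue_per_user_per_year, gross_margin=0.8):
        annual_profit = users * revenue_per_user_per_year * gross_margin
        return capex / annual_profit

    print(years_to_recoup(100e6, 100e6, 20))   # $100M run, 100M users at $20/yr -> ~0.06 years
    print(years_to_recoup(100e9, 100e6, 240))  # $100B run, 100M users at $240/yr -> ~5.2 years

At $50B/year of depreciation, needing roughly five years to recoup is exactly the problem.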




The business model here is the same as semiconductor fabrication/design. 2022 kick-started the foundation model race; teams were readily able to raise $5-25MM to chase foundation models. In early 2024, several of those teams began to run out of money once it became clear that competitive modeling efforts cost in the $1-10 billion range.

If a $100 billion training run produces the highest-quality model in the land across all metrics and capabilities, that will be the model that's used; at most there would be 1-2 other firms willing to spend $100 billion to chase the market.


This. The seemingly never-ending race for foundation models only works as long as companies can afford it. If one of them spends $100B+, it will be a long time before compute catches up to the point that a competitor could reproduce it on a reasonable budget. This is essentially the race for who's going to own AGI, and it shouldn't be surprising that people are willing to spend these amounts.


Given how quickly AI is progressing from the software side, and how poorly AI scales from just throwing raw compute time at a model, I don't see a company holding onto the lead for very long with that strategy.

If I can come out with a model a year later, and it can provide 95% of the performance while costing 10% as much to run, I think I would end up stealing a lot of customers before they had a chance to break even.

Take Llama3-8B for example: an 8-billion-parameter model from 2024 that performs about as well as the original ChatGPT, a 175-billion-parameter model from 2022. It only took 2 years before a model that can run on a desktop could compete with a model that required a data center.


LLMs actually scale extremely well just by throwing compute at them; that's the whole reason they took off. Training a bigger model, training it longer, or increasing the dataset all work more or less equally well. Now that we've pretty much saturated the dataset component (at least for human-written text), everyone throws their compute at bigger models or more epochs.
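For a rough illustration, here's a sketch of the approximate Chinchilla fit (Hoffmann et al. 2022); the constants are roughly the published values, and the two example points are ballpark parameter/token counts, not exact figures:

    # Approximate Chinchilla scaling law: predicted loss falls smoothly
    # as both parameter count N and training tokens D grow.
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(N, D):
        return E + A / N**alpha + B / D**beta

    print(loss(8e9, 15e12))    # small model, lots of tokens (roughly Llama3-8B scale)
    print(loss(175e9, 300e9))  # big model, fewer tokens (roughly GPT-3 scale)

More parameters, more tokens, or more compute on either axis all push the predicted loss down, which is the sense in which "just throw compute at it" has kept working.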


It's totally reasonable to take both bets. It's unclear that the company betting $100B wouldn't also be the company making the $1MM bet.

If you're MSFT, you don't care who wins as long as you have cost-competitive rights to embed the AI in all of your products earlier than others.


> Investing that much in hardware that depreciates over 5+ years and is theoretically still usable at the end, maybe

Isn’t that exactly what’s happening?

A $300k 8x H100 pod with a 5kW power supply burns roughly $6.5k per year in power at $0.15/kWh. The majority of the money is going to capital equipment for the first time in the software industry in decades.

These top-of-the-line chips hold their value far longer than typical depreciation schedules assume. The A100 was released in 2020, but cloud providers still have trouble meeting demand and charge a premium for them.


How are you getting 5kW?

NVIDIA claim 10.2kW for a DGX H100 pod. https://docs.nvidia.com/dgx/dgxh100-user-guide/introduction-....

Your point still stands, though: power is a fraction of the cost.

The bigger issue is power + cooling and how many units are needed to train the better models.


Serves me right for doing back-of-napkin math instead of just looking it up. The SXM4 A100s I use have a TDP of 400W, and I hadn't realized the higher-power H100s go up to 700W.
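Redoing the napkin math with the corrected figure (the cooling overhead/PUE below is my own rough assumption, not a number from this thread):

    # Annual electricity cost for one 8-GPU system at $0.15/kWh,
    # with an assumed PUE of 1.5 to cover cooling and other overhead.
    def annual_power_cost(kw, usd_per_kwh=0.15, pue=1.5):
        return kw * pue * 8760 * usd_per_kwh

    print(annual_power_cost(5.0))    # original 5 kW estimate -> ~$9.9k/yr
    print(annual_power_cost(10.2))   # DGX H100 figure -> ~$20k/yr, still small next to ~$300k of hardware

Either way, power stays a small fraction of the capital cost.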


That’s true for AI, but it is not the right way to think about AGI.

For AGI, the bet is that someone will build an AI capable enough to automate AI development. Once we get there it will pay for itself. The question is what the cost-speed tradeoff to get there looks like.


Pay for itself? Who will pay for this? I don’t think you realize how much $100B is. To put it in perspective, a cutting-edge fab costs almost $10B (TSMC), and only three companies can barely afford that.


"Pay for itself" is probably a bit misleading. The development of transformative AGI, under the control of a single company, has the potential to render the concept of "getting your investment back" quaint and obsolete for that company. They'd never say it, and maybe some of them don't believe it, but they're effectively competing to be in control of the future of civilization. (Meanwhile, the amount of CO2 they're burning in their bid is actively reducing the chances of such a future existing...)


Well, the total world economy is in the $100T region, so I'd expect the argument is that you only need to replace a small fraction of that with AI to generate an ROI.
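Toy numbers (mine, not the parent's) for how small that fraction would need to be:

    # If AI captures some share of a ~$100T world economy per year,
    # how long until a $100B training run is paid back?
    world_gdp = 100e12
    capex = 100e9
    for share in (0.001, 0.005, 0.01):  # 0.1%, 0.5%, 1%
        print(f"{share:.1%}: {capex / (world_gdp * share):.1f} years to recoup")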


I agree with you. Large tech corporations are making big bets to reach AGI first. As an example, if you are the CEO of Google, do you want Microsoft or Meta to achieve AGI first?

This seems less like doing business as usual and more like betting big to be part of something really transformative.


For $100B they would probably want a realistic description of how they get to AGI. That's a bit too much money for the handwavy answers we have right now about the path between LLMs and AGI (which doesn't even have a great definition).


it's not a "once", it's an "if"

it may never happen, especially with this current approach

at which point you've burnt hundreds of billions of dollars, emitted millions of tonnes of CO2 and all you've got out of it is a marginally better array of doubles


Yeah, I think there's too much faith-based investment going on right now. It all smells like the argument that bitcoin would be the future of money, so the price doesn't matter: just buy and hold, and once it takes over you'd be part of the owning class.


automating “development” does not necessarily lead to AGI. An LLM could make minor efficiency improvements all day long and still not change the fundamental approach.


I agree, and I do not think any company would make that investment directly. Nvidia selling to Microsoft, Microsoft renting to OpenAI: I'm sure you could make that add up to $100B on paper. In the long run the economics are likely much more complicated and consist of "agreements worth $x".


Even if they did, they would be the largest target for hackers or corporate espionage. I would find it hard to believe that they would get any sort of good return on this before it was all over the internet, or at least in the hands of several competitors.


TSMC spends ~$30B every 2 years





